Summary: We prove a multivariate version of Bernstein's inequality about the probability that degenerate U-statistics take a value larger than some number u. This is an improvement of former estimates for the same problem which yields an asymptotically sharp estimate for not too large numbers u. This paper also contains an analogous bound about the distribution of multiple Wiener-Itô integrals. Their comparison shows that our results are sharp. The proofs are based on good estimates about high moments of multiple random integrals. They are obtained by means of a diagram formula which enables us to express the product of multiple random integrals as the sum of such expressions.

1. Introduction.

Let us consider a sequence of iid. random variables $\xi_1, \xi_2, \dots$ on a measurable space $(X, \mathcal X)$ with some distribution $\mu$ together with a real valued function $f = f(x_1, \dots, x_k)$ of $k$ variables defined on the $k$-th power $(X^k, \mathcal X^k)$ of the space $(X, \mathcal X)$ and define with their help the U-statistics $I_{n,k}(f)$, $n = k, k+1, \dots$,

$$I_{n,k}(f) = \frac{1}{k!} \sum_{\substack{1 \le j_s \le n,\ s = 1, \dots, k \\ j_s \ne j_{s'} \text{ if } s \ne s'}} f(\xi_{j_1}, \dots, \xi_{j_k}). \tag{1.1}$$

We are interested in good estimates on probabilities of the type $P(n^{-k/2} k! |I_{n,k}(f)| > u)$ under appropriate conditions.
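
As an illustration of definition (1.1), the following minimal Python sketch evaluates $I_{n,k}(f)$ by brute force, summing the kernel over all ordered $k$-tuples of pairwise distinct indices. The function name `u_statistic` and the example kernel are assumptions made for this illustration only; the enumeration is exponential in $k$ and is meant for small samples.

```python
from itertools import permutations
from math import factorial


def u_statistic(f, sample, k):
    """Brute-force I_{n,k}(f) as in (1.1): (1/k!) times the sum of f over all
    ordered k-tuples of pairwise distinct indices of the sample."""
    n = len(sample)
    total = sum(
        f(*(sample[j] for j in idx))
        for idx in permutations(range(n), k)  # tuples with distinct entries
    )
    return total / factorial(k)


# Hypothetical example: the product kernel f(x, y) = x * y on a +-1 sample.
sample = [1, -1, 1, -1]
value = u_statistic(lambda x, y: x * y, sample, 2)
```

For the product kernel the sum over ordered pairs equals $(\sum_j x_j)^2 - \sum_j x_j^2$, which gives a quick sanity check of the enumeration.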

Arcones and Giné in [1] have proved an inequality which can be written in a slightly different but equivalent form as

$$P\left(k!\, n^{-k/2} |I_{n,k}(f)| > u\right) \le c_1 \exp\left\{ -\frac{c_2 u^{2/k}}{\sigma^{2/k} \left( 1 + c_3 \left( u n^{-k/2} \sigma^{-(k+1)} \right)^{\frac{2}{k(k+1)}} \right)} \right\} \tag{1.2}$$

for all $u > 0$ with some universal constants $c_1$, $c_2$ and $c_3$ depending only on the order $k$ of the U-statistic $I_{n,k}(f)$ defined in (1.1) if the function $f$ satisfies the conditions

$$\|f\|_\infty = \sup_{x_j \in X,\ 1 \le j \le k} |f(x_1, \dots, x_k)| \le 1, \tag{1.3}$$

$$\|f\|_2^2 = \int f^2(x_1, \dots, x_k)\, \mu(dx_1) \dots \mu(dx_k) \le \sigma^2, \tag{1.4}$$

and it is canonical with respect to the probability measure $\mu$, i.e.

$$\int f(x_1, \dots, x_{j-1}, u, x_{j+1}, \dots, x_k)\, \mu(du) = 0 \quad \text{for all } 1 \le j \le k \text{ and } x_s \in X,\ s \in \{1, \dots, k\} \setminus \{j\}.$$

A U-statistic defined in (1.1) with the help of a canonical function $f$ is called degenerate in the literature. A degenerate U-statistic is the natural multivariate version of sums of iid. random variables with expectation zero.

Arcones and Giné called their estimate (1.2) a new Bernstein-type inequality. The reason for such a name is that the original Bernstein inequality (see e.g. [3], 1.3.2 Bernstein inequality) states relation (1.2) in the case $k = 1$ with the constants $c_1 = 2$, $c_2 = \frac12$ and $c_3 = \frac13$ if the function $f(x)$ satisfies the conditions $\sup |f(x)| \le 1$, $\int f(x)\, \mu(dx) = 0$ and $\int f^2(x)\, \mu(dx) \le \sigma^2$.
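
The relation between the general bound (1.2) and the classical Bernstein inequality can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the classical inequality is taken in its standard textbook form $P(|\sum_j f(\xi_j)| > t) \le 2\exp\{-t^2/(2(n\sigma^2 + t/3))\}$ with $t = u n^{1/2}$.

```python
from math import exp, isclose, sqrt


def bound_1_2(u, n, sigma, k, c1, c2, c3):
    """Right-hand side of (1.2) for given constants c1, c2, c3."""
    ratio = u * n ** (-k / 2) * sigma ** (-(k + 1))
    denom = sigma ** (2 / k) * (1 + c3 * ratio ** (2 / (k * (k + 1))))
    return c1 * exp(-c2 * u ** (2 / k) / denom)


def bernstein_bound(u, n, sigma):
    """Classical Bernstein bound for P(n^{-1/2} |sum f(xi_j)| > u), |f| <= 1."""
    t = u * sqrt(n)  # threshold for the unnormalized sum
    return 2 * exp(-t ** 2 / (2 * (n * sigma ** 2 + t / 3)))


# For k = 1 the two expressions agree when c1 = 2, c2 = 1/2, c3 = 1/3.
same = isclose(bound_1_2(2.0, 20, 1.0, 1, 2, 0.5, 1 / 3),
               bernstein_bound(2.0, 20, 1.0))
```

The agreement for $k = 1$ is an algebraic identity, so the comparison serves as a consistency check of the exponent $\frac{2}{k(k+1)}$ in (1.2) rather than as a proof of anything.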

Let us fix a number $C > 0$. Formula (1.2) states in particular that for all numbers $0 \le u \le C n^{k/2} \sigma^{k+1}$ and degenerate U-statistics $I_{n,k}(f)$ of order $k$ with a kernel function $f$ satisfying relations (1.3) and (1.4) the inequality $P(n^{-k/2} k! |I_{n,k}(f)| > u) \le A \exp\left\{ -B \left( \frac{u}{\sigma} \right)^{2/k} \right\}$ holds with some appropriate constants $A = A(C, k)$ and $B = B(C, k)$ depending only on the fixed number $C$ and the order $k$ of the degenerate U-statistics. This inequality can be interpreted in the following way: Let us take a random variable $\eta$ with standard normal distribution. Then $P(n^{-k/2} k! |I_{n,k}(f)| > u) \le A\, P\left( \sigma \left| \frac{\eta}{\sqrt{2B}} \right|^k > u \right)$, at least for $0 \le u \le C n^{k/2} \sigma^{k+1}$. Let us also observe that under condition (1.4) the variance of $n^{-k/2} k! I_{n,k}(f)$ is bounded by $k! \sigma^2$, and if the kernel function $f$ is symmetric and there is identity instead of inequality in (1.4), then $\lim_{n \to \infty} \mathrm{Var}\left( n^{-k/2} k! I_{n,k}(f) \right) = k! \sigma^2$.

In the above discussion we have considered the probability $P(n^{-k/2} k! |I_{n,k}(f)| > u)$ only for $0 \le u \le C n^{k/2} \sigma^{k+1}$, while formula (1.2) yields an estimate for such a probability for all $u > 0$. On the other hand, as I shall show later, the above restriction of the parameter $u$ does not mean an important loss of information.

Bernstein's inequality yields an analogous estimate in the case of degenerate U-statistics of order 1, i.e. for sums $\sum_{j=1}^n f(\xi_j)$ with a sequence of iid. random variables $\xi_1, \dots, \xi_n$ and a function $f(x)$ whose absolute value is bounded by 1, $Ef(\xi_1) = 0$ and $Ef(\xi_1)^2 = \sigma^2$. (Actually, Bernstein's inequality is more general; it also yields a bound for the distribution of a sum of independent, not necessarily identically distributed random variables.) But Bernstein's inequality also contains some additional information. It states that if $0 \le u \le \varepsilon n^{1/2} \sigma^2$ with a small $\varepsilon > 0$, then $P\left( n^{-1/2} \left| \sum_{j=1}^n f(\xi_j) \right| > u \right) \le P((1 - C\varepsilon) \sigma |\eta| > u)$ with an appropriate constant $C > 0$. Since $n^{-1/2} \sum_{j=1}^n f(\xi_j)$ has expectation zero and variance $\sigma^2$ the above inequality can be interpreted in such a way that at not too large values $u$ the distribution function of the normalized sum $n^{-1/2} \sum_{j=1}^n f(\xi_j)$ can be bounded by the distribution of a normal random variable with expectation zero and only slightly smaller variance. The main goal of this paper is to show that a similar estimate holds for degenerate U-statistics of any order.

To carry out such a program first we have to find a good multivariate analog of Gaussian random variables. It is natural to consider multiple Wiener-Itô integrals which also appear as the limit of normalized degenerate U-statistics as the sample size tends to infinity. (See e.g. [4].) We shall prove an estimate about the distribution of multiple Wiener-Itô integrals in Theorem 1 and show in Example 2 that this estimate is sharp. The main result of this paper is Theorem 3 which yields an estimate about the tail behaviour of degenerate U-statistics. Its comparison with Theorem 1 shows that Theorem 3 provides an asymptotically sharp estimate on the tail distribution of a degenerate U-statistic for not too large values.

To formulate Theorem 1 let us take a $\sigma$-finite measure $\mu$ on the space $(X, \mathcal X)$ and a white noise $\mu_W$ with counting measure $\mu$ on $(X, \mathcal X)$, i.e. a set of jointly Gaussian random variables $\mu_W(A)$, $A \in \mathcal X$, such that $E\mu_W(A) = 0$, $E\mu_W(A)\mu_W(B) = \mu(A \cap B)$ for all $A \in \mathcal X$ and $B \in \mathcal X$. (We also need the identity $\mu_W(A \cup B) = \mu_W(A) + \mu_W(B)$ with probability 1 if $A \cap B = \emptyset$, but this is the consequence of the previous properties of the white noise. Indeed, they imply that $E\left[ \mu_W(A \cup B) - (\mu_W(A) + \mu_W(B)) \right]^2 = 0$ if $A \cap B = \emptyset$, hence the desired identity holds.) The $k$-fold Wiener-Itô integral of a function $f$

$$J_{\mu,k}(f) = \frac{1}{k!} \int f(x_1, \dots, x_k)\, \mu_W(dx_1) \dots \mu_W(dx_k) \tag{1.5}$$

can be defined with respect to a white noise $\mu_W$ with counting measure $\mu$ if $f$ is a measurable function on the space $(X^k, \mathcal X^k)$, and it satisfies relation (1.4) with some $\sigma^2 < \infty$. (See e.g. [6] or [7].) The expression $J_{\mu,k}(f)$ in formula (1.5) will be called a Wiener-Itô integral of order $k$. Our first result is the following estimate which is an improvement of the upper bound given in Theorem 6.6 of [7].

Theorem 1. Let us consider a $\sigma$-finite measure $\mu$ on a measurable space together with a white noise $\mu_W$ with counting measure $\mu$. Let us have a real-valued function $f(x_1, \dots, x_k)$ on the space $(X^k, \mathcal X^k)$ which satisfies relation (1.4) with some $\sigma^2 < \infty$. Take the random integral $J_{\mu,k}(f)$ introduced in formula (1.5). This random integral satisfies the inequality

$$P(k! |J_{\mu,k}(f)| > u) \le C \exp\left\{ -\frac12 \left( \frac{u}{\sigma} \right)^{2/k} \right\} \quad \text{for all } u > 0 \tag{1.6}$$

with an appropriate constant $C = C(k) > 0$ depending only on the multiplicity $k$ of the integral.

The following example shows that the estimate of Theorem 1 is sharp. 

Example 2. Let us have a $\sigma$-finite measure $\mu$ on some measure space $(X, \mathcal X)$ together with a white noise $\mu_W$ on $(X, \mathcal X)$ with counting measure $\mu$. Let $f_0(x)$ be a real valued function on $(X, \mathcal X)$ such that $\int f_0(x)^2\, \mu(dx) = 1$, and take the function $f(x_1, \dots, x_k) = \sigma f_0(x_1) \cdots f_0(x_k)$ with some number $\sigma > 0$ and the Wiener-Itô integral $J_{\mu,k}(f)$ introduced in formula (1.5).

Then the relation $\int f(x_1, \dots, x_k)^2\, \mu(dx_1) \dots \mu(dx_k) = \sigma^2$ holds, and the random integral $J_{\mu,k}(f)$ satisfies the inequality

$$P(k! |J_{\mu,k}(f)| > u) \ge \frac{\bar C}{\left( \frac{u}{\sigma} \right)^{1/k} + 1} \exp\left\{ -\frac12 \left( \frac{u}{\sigma} \right)^{2/k} \right\} \quad \text{for all } u > 0 \tag{1.7}$$

with some constant $\bar C > 0$.

Proof of the statement of Example 2: We may restrict our attention to the case $k \ge 2$. Itô's formula (see [6] or [7]) states that the random variable $k! J_{\mu,k}(f)$ can be expressed as $k! J_{\mu,k}(f) = \sigma H_k\left( \int f_0(x)\, \mu_W(dx) \right) = \sigma H_k(\eta)$, where $H_k(x)$ is the $k$-th Hermite polynomial with leading coefficient 1, and $\eta = \int f_0(x)\, \mu_W(dx)$ is a standard normal random variable. Hence we get by exploiting that the coefficient of $x^{k-1}$ in the polynomial $H_k(x)$ is zero that $P(k! |J_{\mu,k}(f)| > u) = P\left( |H_k(\eta)| > \frac{u}{\sigma} \right) \ge P\left( |\eta^k| - D |\eta^{k-2}| > \frac{u}{\sigma} \right)$ with a sufficiently large constant $D > 0$ if $\frac{u}{\sigma} > 1$. There exist such positive constants $A$ and $B$ that $P\left( |\eta^k| - D |\eta^{k-2}| > \frac{u}{\sigma} \right) \ge P\left( |\eta^k| > \frac{u}{\sigma} + A \left( \frac{u}{\sigma} \right)^{(k-2)/k} \right)$ if $\frac{u}{\sigma} > B$.

Hence

$$P(k! |J_{\mu,k}(f)| > u) \ge \frac{\bar C}{\left( \frac{u}{\sigma} \right)^{1/k} + 1} \exp\left\{ -\frac12 \left( \frac{u}{\sigma} \right)^{2/k} \right\}$$

with an appropriate $\bar C > 0$ if $\frac{u}{\sigma} > B$. Since $P(k! |J_{\mu,k}(f)| > 0) > 0$, the above inequality also holds for $0 \le \frac{u}{\sigma} \le B$ if the constant $\bar C > 0$ is chosen sufficiently small. This means that relation (1.7) holds.
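
The two properties of $H_k$ used in this proof (leading coefficient 1 and vanishing coefficient of $x^{k-1}$) can be verified with a short computation. The sketch below assumes that $H_k$ denotes the monic (probabilists') Hermite polynomial generated by the standard recurrence $H_{m+1}(x) = x H_m(x) - m H_{m-1}(x)$; the function name `hermite_coeffs` is hypothetical.

```python
def hermite_coeffs(k):
    """Coefficients (lowest degree first) of the monic Hermite polynomial H_k,
    built from the recurrence H_{m+1}(x) = x*H_m(x) - m*H_{m-1}(x)."""
    prev, cur = [1], [0, 1]  # H_0 = 1, H_1 = x
    if k == 0:
        return prev
    for m in range(1, k):
        shifted = [0] + cur                               # x * H_m
        padded = prev + [0] * (len(shifted) - len(prev))  # H_{m-1}, same length
        prev, cur = cur, [a - m * b for a, b in zip(shifted, padded)]
    return cur


# H_4(x) = x^4 - 6x^2 + 3: monic, and the x^3 coefficient vanishes.
h4 = hermite_coeffs(4)
```

Since the recurrence only mixes terms of the same parity, every second coefficient of $H_k$ is zero, which is the fact the lower bound $|H_k(\eta)| \ge |\eta^k| - D|\eta^{k-2}|$ relies on for large $|\eta|$.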

The main result of this paper is the following 

Theorem 3. Let $\xi_1, \dots, \xi_n$ be a sequence of iid. random variables on a space $(X, \mathcal X)$ with some distribution $\mu$. Let us consider a function $f(x_1, \dots, x_k)$ canonical with respect to the measure $\mu$ on the space $(X^k, \mathcal X^k)$ which satisfies conditions (1.3) and (1.4) with some $0 < \sigma^2 \le 1$ together with the degenerate U-statistic $I_{n,k}(f)$ with this kernel function. There exist some constants $A = A(k) > 0$ and $B = B(k) > 0$ depending only on the order $k$ of the U-statistic $I_{n,k}(f)$ such that

$$P\left( k!\, n^{-k/2} |I_{n,k}(f)| > u \right) \le A \exp\left\{ -\frac{u^{2/k}}{2 \sigma^{2/k} \left( 1 + B \left( u n^{-k/2} \sigma^{-(k+1)} \right)^{1/k} \right)} \right\} \tag{1.8}$$

for all $0 \le u \le n^{k/2} \sigma^{k+1}$.

Remark: Actually, the universal constant $B > 0$ can be chosen independently of the order $k$ of the degenerate U-statistic $I_{n,k}(f)$ in inequality (1.8).

Theorem 3 states in particular that if $0 \le u \le \varepsilon n^{k/2} \sigma^{k+1}$ with a sufficiently small $\varepsilon > 0$, then $P(k!\, n^{-k/2} |I_{n,k}(f)| > u) \le A \exp\left\{ -\frac{1 - C \varepsilon^{1/k}}{2} \left( \frac{u}{\sigma} \right)^{2/k} \right\}$ with some universal constants $A > 0$ and $C > 0$ depending only on the order $k$ of the U-statistic $I_{n,k}(f)$. A comparison of this result with Theorem 1 and Example 2 shows that for small $\varepsilon > 0$ this estimate yields the right order in the exponent in first order. This means that for not too large numbers $u$ inequality (1.8) yields an asymptotically optimal estimate.

To understand the previous statement better we can make the following observation: Let us have a probability measure $\mu$ on some measurable space $(X, \mathcal X)$ together with a sequence of iid. random variables $\xi_1, \xi_2, \dots$ with distribution $\mu$, a real-valued function $f_0(x)$ on $(X, \mathcal X)$ such that $\int f_0(x)\, \mu(dx) = 0$, $\int f_0^2(x)\, \mu(dx) = 1$ and a real number $\sigma$. Let us introduce the function $f(x_1, \dots, x_k) = \sigma f_0(x_1) \cdots f_0(x_k)$ on $(X^k, \mathcal X^k)$ and the U-statistics $I_{n,k}(f)$, $n = 1, 2, \dots$, of order $k$ defined in formula (1.1) with this function $f$. Then the U-statistics $I_{n,k}(f)$ are degenerate, the normalized U-statistics $n^{-k/2} I_{n,k}(f)$ converge in distribution to the multiple Wiener-Itô integral $J_{\mu,k}(f)$ introduced in Example 2 (with the same measure $\mu$ and function $f$ which appears in the definition of $I_{n,k}(f)$) as $n \to \infty$ (see e.g. [4]), and this Wiener-Itô integral satisfies relation (1.7). If the supremum of the function $f$ is bounded by 1, then Theorem 3 can be applied for the U-statistics $k!\, n^{-k/2} I_{n,k}(f)$, and the above considerations indicate that for not too large values $u$ the estimate (1.8) is sharp.

Our goal was to find such an estimate about the distribution function of degenerate U-statistics which is asymptotically optimal for not too large values of this function. Inequality (1.8) has this property. In this respect it is similar to Bernstein's inequality which yields such an estimate in the special case $k = 1$. On the other hand, inequality (1.2), proved in [1], does not supply such a bound. Moreover, the method of paper [1] seems not to be strong enough to yield such an estimate. I return to this question later.

Let us remark that relation (1.2) yields a bound for the tail-distribution of a degenerate U-statistic for all numbers $u > 0$, while formula (1.8) holds under the condition $0 \le u \le n^{k/2} \sigma^{k+1}$. Nevertheless, formula (1.8) implies an estimate also for $u > n^{k/2} \sigma^{k+1}$ which is not weaker than the estimate (1.2) (at least if we do not fix the universal constants in these estimates). To see this let us first observe that in the case $n^{k/2} \ge u > n^{k/2} \sigma^{k+1}$ relation (1.8) holds with $\bar\sigma = (u n^{-k/2})^{1/(k+1)}$ in place of $\sigma$ (observe that $\sigma < \bar\sigma \le 1$ and $u = n^{k/2} \bar\sigma^{k+1}$ in this case), and it yields that

$$P\left( k!\, n^{-k/2} |I_{n,k}(f)| > u \right) \le A \exp\left\{ -\frac{1}{2 (1 + B)^{1/k}} \left( \frac{u}{\bar\sigma} \right)^{2/k} \right\} = A e^{-(u^2 n)^{1/(k+1)} / 2 (1 + B)^{1/k}}.$$

On the other hand, $\sigma^{2/k} \left( 1 + c_3 \left( u n^{-k/2} \sigma^{-(k+1)} \right)^{\frac{2}{k(k+1)}} \right) \ge c_3 u^{\frac{2}{k(k+1)}} n^{-1/(k+1)}$, hence the right-hand side of (1.2) can be bounded from below by $c_1 e^{-c_2 (u^2 n)^{1/(k+1)} / c_3}$. Thus relation (1.8) implies relation (1.2) if $n^{k/2} \ge u > n^{k/2} \sigma^{k+1}$ with possibly worse constants $c_1 = A$, $c_2$ and $c_3 = 2 c_2 (1 + B)^{1/k}$. If $u > n^{k/2}$, then the left-hand side of (1.2) equals zero because of the boundedness of the function $f$, and relation (1.2) clearly holds.

Theorem 3 shows some analogy with large deviation results about the average of iid. random variables. If we fix some number larger than the expected value of the average of some iid. random variables, then by the large deviation theory this average can be larger than this number only with exponentially small probability. The term in the exponent of the formula expressing this probability strongly depends on the distribution of the random variables whose average is taken. But if the above probability is considered at a level only slightly greater than the expectation of the average, then this term in the exponent can be well approximated by the value suggested by the central limit theorem. A similar result holds for the distribution of normalized degenerate U-statistics, $n^{-k/2} k! I_{n,k}(f)$. In the case $0 < u \le \text{const.}\, n^{k/2} \sigma^{k+1}$, with $\sigma^2 = Ef^2(\xi_1, \dots, \xi_k)$, we can get a large deviation type estimate for the probability $P(n^{-k/2} k! I_{n,k}(f) > u)$. If $0 < u \le \varepsilon n^{k/2} \sigma^{k+1}$ with a small $\varepsilon > 0$, then we can say more. In this case such an estimate can be given which is suggested by the behaviour of appropriate non-linear functionals of Gaussian processes.

Let me also remark that in the case $u \gg n^{k/2} \sigma^{k+1}$ formula (1.8) (or (1.2)) yields only a rather weak estimate for the probability $P(n^{-k/2} k! I_{n,k}(f) > u)$ for a degenerate U-statistic of order $k$ with a kernel function $f$ satisfying relations (1.3) and (1.4). The weakness of our estimate in this case has a deeper cause. In Examples 3.3 and 8.6 of the work [10] I have presented such examples for degenerate U-statistics of order 1 or 2 with a kernel function $f$ satisfying relations (1.3) and (1.4) for which the lower bounds $P(n^{-1/2} I_{n,1}(f) > u) \ge \exp\left\{ -A n^{1/2} u \log\left( \frac{u}{n^{1/2} \sigma^2} \right) \right\}$ and $P(n^{-1} I_{n,2}(f) > u) \ge \exp\left\{ -A n^{1/3} u^{2/3} \log\left( \frac{u}{n \sigma^3} \right) \right\}$ hold if $B_2 n^{1/2} \ge u \ge B_1 n^{1/2} \sigma^2$ or $B_2 n \ge u \ge B_1 n \sigma^3$ respectively with a sufficiently large $B_1 > 0$ and some appropriate $0 < B_2 < 1$. Similar examples of degenerate U-statistics could also be constructed for any order $k$. Thus there are such degenerate U-statistics of order $k$ with a kernel function satisfying relations (1.3) and (1.4) with some $\sigma > 0$, whose tail distributions have an essentially different behaviour for $u \ll n^{k/2} \sigma^{k+1}$ and $u \gg n^{k/2} \sigma^{k+1}$.

There is another sort of interesting generalization of Bernstein's inequality. I would refer to a recent work of C. Houdré and P. Reynaud-Bouret [5], where good estimates are given for the distribution of degenerate U-statistics of order 2, but in that paper a more general model is considered. It deals with a natural object we can call generalized U-statistic in the special case $k = 2$. Generalized U-statistics can be defined similarly to classical U-statistics with the difference that the underlying independent random variables $\xi_1, \xi_2, \dots$ may be not identically distributed, and the terms in the sum (1.1) are of the form $f_{j_1, \dots, j_k}(\xi_{j_1}, \dots, \xi_{j_k})$. If the functions $f_{j_1, \dots, j_k}$ are canonical, then we can speak about generalized degenerate U-statistics. (The notion of canonical functions can be generalized to this case in a natural way.) The problem about the distribution of generalized degenerate U-statistics can be considered as a multivariate version of the problem about the distribution of sums of independent, but not necessarily identically distributed random variables with expectation zero. Here we do not discuss this question, although it is very interesting. The most essential part of this problem seems to be to find the right formulation of the estimate we have to prove. A good estimate on the distribution of generalized degenerate U-statistics has to depend, beside the variance of the U-statistics, on different quantities which still should be found.

It is natural to expect that generalized degenerate U-statistics $I_{n,k}(f)$ of order $k$ (without normalization) satisfy the inequality

$$P(|I_{n,k}(f)| > u) < A \exp\left\{ -C \left( \frac{u}{V_n} \right)^{2/k} \right\} \tag{1.9}$$

with some universal constants $A = A(k) > 0$ and $C = C(k) > 0$ in a relatively large interval for the parameter $u$, where $V_n^2$ denotes the variance of $I_{n,k}(f)$. An essential problem is to find a relatively good constant $C$ and to determine the interval $0 \le u \le D_n$, where the estimate (1.9) holds. The result of this paper states that in the case of classical degenerate U-statistics (1.9) holds in the interval $[0, D_n]$ with $D_n = \text{const.}\, n^k \sigma^{k+1}$, where $\sigma^2 = Ef(\xi_1, \dots, \xi_k)^2$. For $k = 1$ this means that relation (1.9) holds in the interval $0 \le u \le V_n^2$. But it is not clear what corresponds in the general case to the right end-point $D_n = \text{const.}\, n^k \sigma^{k+1}$ of the interval where the estimate (1.9) should hold.

This paper consists of six sections and an Appendix. In Section 2 the method of the proofs is explained. Our results will be proved by means of a good estimate on high (but not too high) moments of the random variables we are investigating. These estimates are obtained by means of a diagram formula which enables us to express the product of stochastic integrals or degenerate U-statistics as a sum of such expressions. Section 3 contains the proof of Theorem 1. We formulate a version of the diagram formula about the product of two degenerate U-statistics in Section 4. In Section 5 this result will be generalized to the product of $L \ge 2$ degenerate U-statistics, and an estimate is given about the $L_2$-norm of the kernel functions appearing in the U-statistics of this result. Theorem 3 will be proved in Section 6. The diagram formula about the product of two degenerate U-statistics is proved in the Appendix.

2. The idea of the proof. 

Theorem 1 will be proved by means of the following 

Proposition A. Let the conditions of Theorem 1 be satisfied for a multiple Wiener-Itô integral $J_{\mu,k}(f)$ of order $k$. Then, with the notations of Theorem 1, the inequality

$$E\left( k! |J_{\mu,k}(f)| \right)^{2M} \le 1 \cdot 3 \cdot 5 \cdots (2kM - 1)\, \sigma^{2M} \quad \text{for all } M = 1, 2, \dots \tag{2.1}$$

holds.

By the Stirling formula Proposition A implies that

$$E\left( k! |J_{\mu,k}(f)| \right)^{2M} \le \frac{(2kM)!}{2^{kM} (kM)!}\, \sigma^{2M} \le A \left( \frac{2}{e} \right)^{kM} (kM)^{kM} \sigma^{2M} \tag{2.2}$$

for any $A > \sqrt2$ if $M \ge M_0 = M_0(A)$. The following Proposition B which will be applied in the proof of Theorem 3 states a similar, but weaker inequality for the moments of normalized degenerate U-statistics.
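
The Stirling step behind (2.2) can be checked numerically. The following sketch compares the exact product $1 \cdot 3 \cdots (2N - 1) = \frac{(2N)!}{2^N N!}$ with the bound $A (2N/e)^N$ for $N = kM$; the function names are hypothetical, and the choice $A = \sqrt2$ is the limiting constant suggested by Stirling's formula, used here only for illustration.

```python
from math import e, sqrt


def odd_double_factorial(n_pairs):
    """Exact value of 1 * 3 * 5 * ... * (2 * n_pairs - 1)."""
    result = 1
    for m in range(1, 2 * n_pairs, 2):
        result *= m
    return result


def stirling_bound(n_pairs, A=sqrt(2)):
    """The quantity A * (2/e)^N * N^N from (2.2), with N = n_pairs = k*M."""
    return A * (2 * n_pairs / e) ** n_pairs


# Ratio of the exact product to (2N/e)^N; Stirling's formula predicts that
# it tends to sqrt(2) as N grows.
ratios = [odd_double_factorial(N) / (2 * N / e) ** N for N in (1, 5, 50)]
```

The ratio increases towards $\sqrt2$, which is consistent with the requirement $A > \sqrt2$ in (2.2).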

Proposition B. Let us consider a degenerate U-statistic $I_{n,k}(f)$ of order $k$ with sample size $n$ and with a kernel function $f$ satisfying relations (1.3) and (1.4) with some $0 < \sigma^2 \le 1$. Fix a positive number $\eta > 0$. There exist some universal constants $A = A(k) > \sqrt2$, $C = C(k) > 0$ and $M_0 = M_0(k) \ge 1$ depending only on the order of the U-statistic $I_{n,k}(f)$ such that

$$E\left( n^{-k/2} k! I_{n,k}(f) \right)^{2M} \le A \left( 1 + C \sqrt{\eta} \right)^{2kM} \left( \frac{2}{e} \right)^{kM} (kM)^{kM} \sigma^{2M} \tag{2.3}$$

for all integers $M$ such that $kM_0 \le kM \le \eta n \sigma^2$.

The constant $C = C(k)$ in formula (2.3) can be chosen e.g. as $C = 2\sqrt2$ which does not depend on the order $k$ of the U-statistic $I_{n,k}(f)$.

Let us remark that formula (2.1) can be reformulated as $E(k! |J_{\mu,k}(f)|)^{2M} \le E(\sigma \eta^k)^{2M}$, where $\eta$ is a standard normal random variable. Theorem 1 states that the tail distribution of $k! |J_{\mu,k}(f)|$ satisfies an estimate similar to that of $\sigma |\eta|^k$. This will simply follow from Proposition A and the Markov inequality $P(k! |J_{\mu,k}(f)| > u) \le \frac{E(k! |J_{\mu,k}(f)|)^{2M}}{u^{2M}}$ with an appropriate choice of the parameter $M$.

Proposition B states that in the case $M_0 \le M \le \varepsilon n \sigma^2$ the inequality

$$E\left( n^{-k/2} k! I_{n,k}(f) \right)^{2M} \le E\left( (1 + \beta(\varepsilon)) \sigma \eta^k \right)^{2M}$$

holds with a standard normal random variable $\eta$ and a function $\beta(\varepsilon)$, $0 \le \varepsilon \le 1$, such that $\beta(\varepsilon) \to 0$ if $\varepsilon \to 0$, and $\beta(\varepsilon) \le C$ with some universal constant $C = C(k)$ depending only on the order $k$ of the U-statistic for all $0 \le \varepsilon \le 1$. This means that certain high but not too high moments of $n^{-k/2} k! I_{n,k}(f)$ behave similarly to the moments of $k! J_{\mu,k}(f)$. As a consequence, we can prove a similar, but slightly weaker estimate for the distribution of $n^{-k/2} k! I_{n,k}(f)$ as for the distribution of $k! J_{\mu,k}(f)$. Theorem 3 contains the result we can get about the distribution of $I_{n,k}(f)$ by means of these moment estimates.

The proof of Proposition A is based on a corollary of a most important result about Wiener-Itô integrals called the diagram formula. This result enables us to rewrite the product of Wiener-Itô integrals as a sum of Wiener-Itô integrals of different order. It got the name 'diagram formula' because the kernel functions of the Wiener-Itô integrals appearing in the sum representation of the product of Wiener-Itô integrals are defined with the help of certain diagrams. As the expectation of a Wiener-Itô integral of order $k$ is zero for all $k \ge 1$, the expectation of the product equals the sum of the constant terms (i.e. of the integrals of order zero) in the diagram formula. We shall see that Proposition A can be proved relatively simply by means of this corollary of the diagram formula.

We shall also see that there is a version of the diagram formula which enables us to express the product of degenerate U-statistics as a sum of degenerate U-statistics of different order. Proposition B can be proved by means of this version of the diagram formula similarly to the proof of Proposition A. The main difference between their proofs is that in the case of the diagram formula for degenerate U-statistics some new diagrams also appear, and their contribution also has to be estimated. It will be shown that if not too high moments of U-statistics are calculated by means of this new version of the diagram formula, then the contribution of the new diagrams is not too large.

The proof of formula (1.2) in [1] also contains the proof of the inequality

(2.4)

with some appropriate constant $C = C(k)$ for $M \le n \sigma^2$ in an implicit way. This estimate is sufficient to prove relation (1.2), but insufficient to prove Theorem 3. In this case we need such a sharpened version of inequality (2.4) which contains an asymptotically optimal constant $C$ if $M \le \varepsilon n \sigma^2$ with a small coefficient $\varepsilon > 0$. But the method of paper [1] is not strong enough to prove such a sharpened version of (2.4).

One reason for this weakness of the method of paper [1] is that it applies a consequence of Borell's inequality which does not give a sharp inequality. Nevertheless, this inequality could be improved. (See my paper [9].) Another problem is that the proof in [1] contains a decoupling argument of paper [2]. This argument, which is needed to apply a multivariate version of the Marcinkiewicz-Zygmund inequality, also weakens the universal constants in formula (1.2). This difficulty could also be overcome by some clever tricks. But the application of the Marcinkiewicz-Zygmund inequality does not allow us to prove relation (2.4) with an optimal constant $C$. The proof of this inequality is based on a symmetrization argument which implies in particular that the moments of $n^{-k/2} k! I_{n,k}(f)$ are bounded by the (same) moments of a random variable with a constant times greater variance. The influence of this too large variance is inherited in all subsequent estimates, and as a consequence, a method applying a symmetrization argument cannot yield the estimate (2.4) with a sharp constant $C$.

3. The proof of Theorem 1. 

To formulate the corollary of the diagram formula we need in the proof of Proposition A, I first introduce some notations.

Let us have a $\sigma$-finite measure $\mu$ together with a white noise $\mu_W$ with counting measure $\mu$ on $(X, \mathcal X)$. Let us consider $L$ real valued functions $f_l(x_1, \dots, x_{k_l})$ on $(X^{k_l}, \mathcal X^{k_l})$ such that $\int f_l^2(x_1, \dots, x_{k_l})\, \mu(dx_1) \dots \mu(dx_{k_l}) < \infty$, $1 \le l \le L$. Let us introduce the Wiener-Itô integrals $k_l!\, J_{\mu,k_l}(f_l) = \int f_l(x_1, \dots, x_{k_l})\, \mu_W(dx_1) \dots \mu_W(dx_{k_l})$, $1 \le l \le L$, and describe how the expected value $E\left( \prod_{l=1}^L k_l!\, J_{\mu,k_l}(f_l) \right)$ can be calculated by means of the diagram formula.

For this goal let us introduce the following notations. Put

$$F(x_{(l,j)},\ 1 \le l \le L,\ 1 \le j \le k_l) = \prod_{l=1}^L f_l(x_{(l,1)}, \dots, x_{(l,k_l)}), \tag{3.1}$$

and define a class of diagrams $\Gamma(k_1, \dots, k_L)$ in the following way: Each diagram $\gamma \in \Gamma(k_1, \dots, k_L)$ is a (complete, undirected) graph with vertices $(l, j)$, $1 \le l \le L$, $1 \le j \le k_l$, and we shall call the set of vertices $(l, j)$, $1 \le j \le k_l$, with a fixed index $l$ the $l$-th row of a graph $\gamma \in \Gamma(k_1, \dots, k_L)$. The graphs $\gamma \in \Gamma(k_1, \dots, k_L)$ will have edges with the following properties. Each edge connects vertices $(l, j)$ and $(l', j')$ from different rows, i.e. $l \ne l'$ for the end-points of an edge. From each vertex there starts exactly one edge. $\Gamma(k_1, \dots, k_L)$ contains all graphs $\gamma$ with such properties. If there is no such graph, then $\Gamma(k_1, \dots, k_L)$ is empty.

Put $2N = \sum_{l=1}^L k_l$. Then each $\gamma \in \Gamma(k_1, \dots, k_L)$ contains exactly $N$ edges. If an edge of the diagram $\gamma$ connects some vertex $(l, j)$ with some other vertex $(l', j')$, $l' > l$, then we call $(l', j')$ the lower end-point of this edge, and we denote the set of lower end-points of $\gamma$ by $A_\gamma$ which has $N$ elements. Let us also introduce the following function $\alpha_\gamma$ on the vertices of $\gamma$. Put $\alpha_\gamma(l, j) = (l, j)$ if $(l, j)$ is the lower end-point of an edge, and $\alpha_\gamma(l, j) = (l', j')$ if $(l, j)$ is connected with the point $(l', j')$ by an edge of $\gamma$, and $(l', j')$ is the lower end-point of this edge. Then we define the function

$$F_\gamma(x_{(l,j)},\ (l, j) \in A_\gamma) = F(x_{\alpha_\gamma(l,j)},\ 1 \le l \le L,\ 1 \le j \le k_l) \tag{3.2}$$

with the function $F$ introduced in (3.1), i.e. we replace the argument $x_{(l,j)}$ by $x_{(l',j')}$ in the function $F$ if $(l, j)$ and $(l', j')$ are connected by an edge in $\gamma$, and $l' > l$. Then we enumerate the lower end-points somehow, and define the function $B_\gamma(r)$, $1 \le r \le N$, such that $B_\gamma(r)$ is the $r$-th lower end-point of the diagram $\gamma$. Write

$$\bar F_\gamma(x_1, \dots, x_N) = F_\gamma(x_{B_\gamma(r)},\ 1 \le r \le N)$$

and

$$F_\gamma = \int \cdots \int \bar F_\gamma(x_1, \dots, x_N)\, \mu(dx_1) \dots \mu(dx_N) \quad \text{for all } \gamma \in \Gamma(k_1, \dots, k_L). \tag{3.3}$$

(The function $\bar F_\gamma(x_1, \dots, x_N)$ depends on the enumeration of the lower end-points of the diagram $\gamma$, but its integral $F_\gamma$ is independent of it.)

We shall need the following corollary of the diagram formula.

Theorem A. With the above introduced notation

$$E\left( \prod_{l=1}^L k_l!\, J_{\mu,k_l}(f_l) \right) = \sum_{\gamma \in \Gamma(k_1, \dots, k_L)} F_\gamma. \tag{3.4}$$

(If $\Gamma(k_1, \dots, k_L)$ is empty, then the expected value of the above product of random integrals equals zero.) Beside this

$$F_\gamma^2 \le \prod_{l=1}^L \int f_l^2(x_1, \dots, x_{k_l})\, \mu(dx_1) \dots \mu(dx_{k_l}) \quad \text{for all } \gamma \in \Gamma(k_1, \dots, k_L). \tag{3.5}$$

The proof of Theorem A can be found in Corollary 5.4 of [7] or [6]. The result of [7] actually deals with a different version of Wiener-Itô integrals where their 'Fourier transforms' are considered, and we integrate not with respect to a white noise, but with respect to its 'Fourier transform'. The results obtained for such integrals are actually equivalent to the result formulated in Theorem A. I formulated Theorem A in the present form because generally this version of Wiener-Itô integrals is applied in the literature, and it can be compared better with the diagram formula for the product of degenerate U-statistics applied in this paper. Paper [6] contains the diagram formula for the version of Wiener-Itô integrals considered in this paper. The result of Theorem A which is not contained explicitly in [6] can be deduced from the diagram formula proved in [6] in the same (simple) way as Corollary 5.4 is proved in [7]. Now we turn to the proof of Proposition A.

Proof of Proposition A. Proposition A can be simply proved with the help of Theorem A if we apply it with $L = 2M$, and the functions $f_l(x_1, \dots, x_k) = f(x_1, \dots, x_k)$ for all $1 \le l \le 2M$. Then Theorem A yields that

$$E\left( k! J_{\mu,k}(f) \right)^{2M} \le \left( \int f^2(x_1, \dots, x_k)\, \mu(dx_1) \dots \mu(dx_k) \right)^M |\Gamma_{2M}(k)|,$$

where $|\Gamma_{2M}(k)|$ denotes the number of diagrams $\gamma$ in $\Gamma(\underbrace{k, \dots, k}_{2M \text{ times}})$. Thus to complete the proof of Proposition A it is enough to show that $|\Gamma_{2M}(k)| \le 1 \cdot 3 \cdot 5 \cdots (2kM - 1)$. But this can be seen simply with the help of the following observation. Let $\bar\Gamma_{2M}(k)$ denote the class of all graphs with vertices $(l, j)$, $1 \le l \le 2M$, $1 \le j \le k$, such that from all vertices $(l, j)$ exactly one edge starts, all edges connect different vertices, but we also allow edges connecting vertices $(l, j)$ and $(l, j')$ with the same first coordinate $l$. Let $|\bar\Gamma_{2M}(k)|$ denote the number of graphs in $\bar\Gamma_{2M}(k)$. Then clearly $|\Gamma_{2M}(k)| \le |\bar\Gamma_{2M}(k)|$. On the other hand, $|\bar\Gamma_{2M}(k)| = 1 \cdot 3 \cdot 5 \cdots (2kM - 1)$. Indeed, let us list the vertices of the graphs from $\bar\Gamma_{2M}(k)$ in an arbitrary way. Then the first vertex can be paired with another vertex in $2kM - 1$ ways, after this the first vertex from which no edge starts can be paired with $2kM - 3$ vertices from which no edge starts. By following this procedure the next edge can be chosen in $2kM - 5$ ways, and by continuing this calculation we get the desired formula.
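
The counting argument above can be reproduced for small cases. The sketch below enumerates pairings of the vertices $(l, j)$ recursively, always pairing the first unmatched vertex, exactly as in the proof; with same-row edges allowed it counts the larger class of graphs, and with same-row edges forbidden it counts the diagrams of Theorem A. The function name and its interface are assumptions of this illustration, and the enumeration is feasible only for a handful of vertices.

```python
def count_pairings(rows, allow_same_row):
    """Count complete pairings of vertices (l, j), l-th row having rows[l] vertices.
    If allow_same_row is False, edges inside a row are forbidden (diagrams)."""
    # Only the row label of each vertex matters for the counting.
    labels = tuple(l for l, k in enumerate(rows) for _ in range(k))

    def count(remaining):
        if not remaining:
            return 1
        first, rest = remaining[0], remaining[1:]
        total = 0
        for i, other in enumerate(rest):  # pair the first unmatched vertex
            if allow_same_row or other != first:
                total += count(rest[:i] + rest[i + 1:])
        return total

    return count(labels)


# k = 2, 2M = 4 rows: all pairings versus pairings without same-row edges.
all_pairings = count_pairings([2, 2, 2, 2], allow_same_row=True)
diagrams = count_pairings([2, 2, 2, 2], allow_same_row=False)
```

With $k = 2$ and $2M = 4$ the larger class has $1 \cdot 3 \cdot 5 \cdot 7 = 105$ elements while only 60 of them are diagrams, which illustrates that the bound used in the proof is not an identity.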

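Proof of Theorem 1. By Proposition A, formula (2.2) and the Markov inequality we have

$$P(|k! J_{\mu,k}(f)| > u) \le \frac{E\left( k! J_{\mu,k}(f) \right)^{2M}}{u^{2M}} \le A \left( \frac{2kM \sigma^{2/k}}{e u^{2/k}} \right)^{kM} \tag{3.6}$$

with some constant $A > \sqrt2$ if $M \ge M_0$ with some constant $M_0 = M_0(A)$, and $M$ is an integer.

Put $\bar M = \bar M(u) = \frac{1}{2k} \left( \frac{u}{\sigma} \right)^{2/k}$, and $M = M(u) = [\bar M]$, where $[x]$ denotes the integer part of a real number $x$. Choose some number $u_0$ such that $\frac{1}{2k} \left( \frac{u_0}{\sigma} \right)^{2/k} \ge M_0 + 1$. Then we can apply relation (3.6) with $M = M(u)$ for $u \ge u_0$, and it yields that

$$P(|k! J_{\mu,k}(f)| > u) \le A \left( \frac{2kM \sigma^{2/k}}{e u^{2/k}} \right)^{kM} \le A e^{-kM} \le A e^k e^{-k \bar M} = A e^k \exp\left\{ -\frac12 \left( \frac{u}{\sigma} \right)^{2/k} \right\} \quad \text{if } u \ge u_0. \tag{3.7}$$

Relation (3.7) means that relation (1.6) holds for $u \ge u_0$ with the pre-exponential coefficient $A e^k$. By enlarging this coefficient if it is needed we can guarantee that relation (1.6) holds for all $u > 0$. Theorem 1 is proved.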
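
The choice of the parameter $M$ in this proof can be illustrated numerically. The sketch below evaluates the Markov bound built from the moment estimate (2.1) with $M = [\bar M]$ and compares it with the expression $A e^k \exp\{-\frac12 (u/\sigma)^{2/k}\}$. The function names are hypothetical, and the value $A = \sqrt2$ is an assumption made for the illustration (the proof only requires some $A > \sqrt2$ and sufficiently large $u$).

```python
from math import exp, floor, sqrt


def markov_moment_bound(u, sigma, k):
    """Markov bound 1*3*...*(2kM-1) * sigma^(2M) / u^(2M) with M = [M_bar]."""
    m_bar = (u / sigma) ** (2 / k) / (2 * k)
    M = floor(m_bar)
    if M < 1:
        raise ValueError("u / sigma is too small for this choice of M")
    moment = sigma ** (2 * M)
    for odd in range(1, 2 * k * M, 2):  # right-hand side of (2.1)
        moment *= odd
    return moment / u ** (2 * M), M


def gaussian_type_bound(u, sigma, k, A=sqrt(2)):
    """The expression A * e^k * exp(-(u/sigma)^(2/k) / 2) from (3.7)."""
    return A * exp(k) * exp(-0.5 * (u / sigma) ** (2 / k))


# Hypothetical example: k = 2, sigma = 1, u = 10 gives M_bar = 2.5 and M = 2.
bound, M = markov_moment_bound(10, 1, 2)
target = gaussian_type_bound(10, 1, 2)
```

In this example the moment bound with $M = 2$ is already below the Gaussian-type expression, in line with the chain of inequalities in (3.7).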
4. The diagram formula for the product of two degenerate U-statistics.

To prove Proposition B we need a result analogous to Theorem A about the expectation of products of degenerate U-statistics. To get such a result first we describe the product of two degenerate U-statistics as the sum of degenerate U-statistics of different order together with a good estimate on the $L_2$-norm of the kernel functions in the sum representation. The proof of this result will be given in the Appendix. We can get with the help of an inductive procedure a generalization of this result. It yields a representation of the product of several degenerate U-statistics in the form of a sum of degenerate U-statistics which implies a formula about the expected value of products of degenerate U-statistics useful in the proof of Proposition B. This generalization will be discussed in the next section.

Let us have a sequence of iid. random variables $\xi_1,\xi_2,\dots$ with some distribution
$\mu$ on a measurable space $(X,\mathcal X)$ together with two functions $f(x_1,\dots,x_{k_1})$ and
$g(x_1,\dots,x_{k_2})$ on $(X^{k_1},\mathcal X^{k_1})$ and on $(X^{k_2},\mathcal X^{k_2})$ respectively which are canonical with
respect to the probability measure $\mu$. We consider the degenerate U-statistics $I_{n,k_1}(f)$
and $I_{n,k_2}(g)$ and express their normalized product $k_1!k_2!\,n^{-(k_1+k_2)/2}I_{n,k_1}(f)I_{n,k_2}(g)$ as
a sum of (normalized) degenerate U-statistics. This product can be written as a sum
of U-statistics in a natural way, and then by applying the Hoeffding decomposition for
each of these U-statistics as a sum of degenerate U-statistics we get the desired
representation of the product of two degenerate U-statistics. The result we get in such a way
will be presented in Theorem B. Before its formulation I introduce some notations.

To define the kernel functions of the U-statistics appearing in the diagram formula
for the product of two U-statistics first we introduce a class of objects $\Gamma(k_1,k_2)$ we
shall call coloured diagrams. We define graphs $\gamma\in\Gamma(k_1,k_2)$ that contain the vertices
$(1,1),(1,2),\dots,(1,k_1)$ which we shall call the first row and $(2,1),\dots,(2,k_2)$ which we
shall call the second row of these graphs. From each vertex there starts zero or one
edge, and all edges connect vertices from different rows. All edges will get a colour $+1$
or $-1$. $\Gamma(k_1,k_2)$ consists of all $\gamma$ obtained in such a way which we shall call coloured
diagrams.
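
To make the definition concrete, the following Python sketch lists the coloured diagrams with two rows and compares their number with a closed form obtained by choosing $m$ edges, matching their end-points and colouring them. Representing an edge between $(1,j)$ and $(2,j')$ as a triple `(j, j_prime, colour)` is an assumption made for this illustration, as are the function names.

```python
from itertools import combinations, permutations, product
from math import comb, factorial


def coloured_diagrams(k1, k2):
    """Yield each coloured diagram as a tuple of edges (j, j_prime, colour)."""
    for m in range(min(k1, k2) + 1):
        for upper in combinations(range(1, k1 + 1), m):
            # permutations(..., m) runs over ordered choices of m lower vertices,
            # so every injective matching of `upper` is produced exactly once.
            for lower in permutations(range(1, k2 + 1), m):
                for colours in product((1, -1), repeat=m):
                    yield tuple(zip(upper, lower, colours))


def diagram_count(k1, k2):
    """Closed form: choose m edges, match their end-points, colour them."""
    return sum(comb(k1, m) * comb(k2, m) * factorial(m) * 2 ** m
               for m in range(min(k1, k2) + 1))


for k1, k2 in [(1, 1), (2, 2), (2, 3)]:
    assert len(list(coloured_diagrams(k1, k2))) == diagram_count(k1, k2)
```

For $k_1 = k_2 = 1$ there are three diagrams: the diagram without edges, and the single edge with colour $1$ or $-1$.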

Given a coloured diagram $\gamma\in\Gamma(k_1,k_2)$ let $B_u(\gamma)$ denote the set of upper end-points
$(1,j)$ of the edges of the graph $\gamma$, $B_{(b,1)}(\gamma)$ the set of lower end-points $(2,j)$ of
the edges of $\gamma$ with colour $1$, and $B_{(b,-1)}(\gamma)$ the set of lower end-points $(2,j)$ of the
edges of $\gamma$ with colour $-1$. (The letter 'b' in the index was chosen because of the word
below.) Finally, let $Z(\gamma)$ denote the set of edges with colour $1$, $W(\gamma)$ the set of edges
with colour $-1$ of a coloured graph $\gamma\in\Gamma(k_1,k_2)$, and let $|Z(\gamma)|$ and $|W(\gamma)|$ denote
their cardinality.

Given two functions $f(x_1,\dots,x_{k_1})$ and $g(x_1,\dots,x_{k_2})$ let us define the function

$$(f\circ g)(x_{(1,1)},\dots,x_{(1,k_1)},x_{(2,1)},\dots,x_{(2,k_2)}) = f(x_{(1,1)},\dots,x_{(1,k_1)})\,g(x_{(2,1)},\dots,x_{(2,k_2)}). \tag{4.1}$$

Given a function $h(x_{u_1},\dots,x_{u_r})$ with coordinates in the space $(X,\mathcal X)$ (the indices
$u_1,\dots,u_r$ are all different) let us introduce its transforms $P_{u_j}h$ and $Q_{u_j}h$ by the formulas

$$P_{u_j}h(x_{u_l}\colon\ u_l\in\{u_1,\dots,u_r\}\setminus\{u_j\}) = \int h(x_{u_1},\dots,x_{u_r})\,\mu(dx_{u_j}), \quad 1\le j\le r, \tag{4.2}$$

and

$$Q_{u_j}h(x_{u_1},\dots,x_{u_r}) = h(x_{u_1},\dots,x_{u_r}) - \int h(x_{u_1},\dots,x_{u_r})\,\mu(dx_{u_j}), \quad 1\le j\le r. \tag{4.3}$$

At this point I started to apply a notation which may seem to be too complicated,
but I think that it is more appropriate in the further discussion. Namely, I started
to apply a rather general enumeration $u_1,\dots,u_r$ of the arguments of the functions we
are working with instead of their simpler enumeration with indices $1,\dots,r$. But in the
further discussion there will appear an enumeration of the arguments by pairs of integers
$(l,j)$ in a natural way, and I found it simpler to work with such an enumeration than
to reindex our variables all the time. Let me remark in particular that this means that
the definition of the U-statistic with a kernel function $f(x_1,\dots,x_k)$ given in formula
(1.1) will appear sometimes in the following more complicated, but actually equivalent
form: We shall work with kernel function $f(x_{u_1},\dots,x_{u_k})$ instead of $f(x_1,\dots,x_k)$, the
random variables $\xi_j$ will be indexed by $u_s$, i.e. to the coordinate $x_{u_s}$ we shall put the
random variables $\xi_{j_{u_s}}$ with indices $1\le j_{u_s}\le n$, and in the new notation formula (1.1)
will look like

$$I_{n,k}(f) = \frac1{k!}\sum_{\substack{1\le j_{u_s}\le n,\ s=1,\dots,k\\ j_{u_s}\neq j_{u_{s'}}\ \text{if } u_s\neq u_{s'}}} f(\xi_{j_{u_1}},\dots,\xi_{j_{u_k}}). \tag{1.1'}$$

Let us define for all coloured diagrams $\gamma\in\Gamma(k_1,k_2)$ the function $\alpha_\gamma(1,j)$,
$1\le j\le k_1$, on the vertices of the first row of $\gamma$ as $\alpha_\gamma(1,j) = (1,j)$ if no edge starts from
$(1,j)$, and $\alpha_\gamma(1,j) = (2,j')$ if an edge of $\gamma$ connects the vertices $(1,j)$ and $(2,j')$.
Given two functions $f(x_1,\dots,x_{k_1})$ and $g(x_1,\dots,x_{k_2})$ together with a coloured diagram
$\gamma\in\Gamma(k_1,k_2)$ let us introduce, with the help of the above defined function $\alpha_\gamma(\cdot)$ and
$(f\circ g)$ introduced in (4.1) the function

$$\overline{(f\circ g)}_\gamma(x_{(1,j)},x_{(2,j')},\ j\in\{1,\dots,k_1\}\setminus B_u(\gamma),\ 1\le j'\le k_2) = (f\circ g)(x_{\alpha_\gamma(1,1)},\dots,x_{\alpha_\gamma(1,k_1)},x_{(2,1)},\dots,x_{(2,k_2)}). \tag{4.4}$$

(In words, we take the function $(f\circ g)$, and if there is an edge of $\gamma$ starting from a
vertex $(1,j)$, and it connects this vertex with the vertex $(2,j')$, then the argument $x_{(1,j)}$
is replaced by the argument $x_{(2,j')}$ in this function.) Let us also introduce the function

$$(f\circ g)_\gamma(x_{(1,j)},x_{(2,j')},\ j\in\{1,\dots,k_1\}\setminus B_u(\gamma),\ j'\in\{1,\dots,k_2\}\setminus B_{(b,1)}(\gamma)) = \prod_{(2,j')\in B_{(b,1)}(\gamma)} P_{(2,j')} \prod_{(2,j')\in B_{(b,-1)}(\gamma)} Q_{(2,j')}\ \overline{(f\circ g)}_\gamma(x_{(1,j)},x_{(2,j')},\ j\in\{1,\dots,k_1\}\setminus B_u(\gamma),\ 1\le j'\le k_2). \tag{4.5}$$

(In words, we take the function $\overline{(f\circ g)}_\gamma$ and for such indices $(2,j')$ of the graph $\gamma$ from
which an edge with colour $1$ starts we apply the operator $P_{(2,j')}$ introduced in formula
(4.2) and for those indices $(2,j')$ from which an edge with colour $-1$ starts we apply the
operator $Q_{(2,j')}$ defined in formula (4.3).) Let us also remark that the operators $P_{(2,j')}$
and $Q_{(2,j')}$ are exchangeable for different indices $j'$, hence it is not important in which
order we apply the operators $P_{(2,j')}$ and $Q_{(2,j')}$ in formula (4.5).

In the definition of the function $(f\circ g)_\gamma$ those arguments $x_{(2,j')}$ of the function
$\overline{(f\circ g)}_\gamma$ which are indexed by a pair $(2,j')$ from which an edge of colour $1$ of the coloured
diagram $\gamma$ starts will disappear, while the arguments indexed by a pair $(2,j')$ from which
an edge of colour $-1$ of the coloured diagram $\gamma$ starts will be preserved. Hence the
number of arguments in the function $(f\circ g)_\gamma$ equals $k_1+k_2-2|B_{(b,1)}(\gamma)|-|B_{(b,-1)}(\gamma)|$,
where $|B_{(b,1)}(\gamma)|$ and $|B_{(b,-1)}(\gamma)|$ denote the cardinality of the lower end-points of the
edges of the coloured diagram $\gamma$ with colour $1$ and $-1$ respectively. In an equivalent form
we can say that the number of arguments of $(f\circ g)_\gamma$ equals $k_1+k_2-(2|Z(\gamma)|+|W(\gamma)|)$.
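
The construction in (4.1)-(4.5) can be carried out explicitly on a finite space. The sketch below is only an illustration under stated assumptions: a toy three-point space with the measure `MU`, functions stored as tables indexed by argument values, and edges written as triples `(j, j_prime, colour)`. All names (`P`, `Q`, `make_canonical`, `f_circ_g_gamma`) are hypothetical helpers. It builds $(f\circ g)_\gamma$, so that one can check the number of its arguments and, for canonical $f$ and $g$, that it integrates to zero in each of its variables.

```python
from fractions import Fraction
from itertools import product

MU = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]  # toy measure on X = {0, 1, 2}
POINTS = range(len(MU))


def P(names, table, name):
    """Operator (4.2): integrate the argument `name` out with respect to MU."""
    i = names.index(name)
    new_table = {}
    for xs, value in table.items():
        key = xs[:i] + xs[i + 1:]
        new_table[key] = new_table.get(key, 0) + MU[xs[i]] * value
    return names[:i] + names[i + 1:], new_table


def Q(names, table, name):
    """Operator (4.3): subtract the integral in `name`; all arguments are kept."""
    i = names.index(name)
    _, integrated = P(names, table, name)
    return names, {xs: v - integrated[xs[:i] + xs[i + 1:]] for xs, v in table.items()}


def make_canonical(k, raw):
    """Apply Q in every variable of `raw` to get a canonical function of k variables."""
    names = tuple(range(1, k + 1))
    table = {xs: Fraction(raw(*xs)) for xs in product(POINTS, repeat=k)}
    for name in names:
        names, table = Q(names, table, name)
    return table


def f_circ_g_gamma(f, k1, g, k2, edges):
    """Build (f o g)_gamma for edges given as triples (j, j_prime, colour)."""
    alpha = {j: (2, jp) for j, jp, _ in edges}      # alpha_gamma on matched vertices
    names = tuple((1, j) for j in range(1, k1 + 1) if j not in alpha)
    names += tuple((2, jp) for jp in range(1, k2 + 1))
    table = {}
    for xs in product(POINTS, repeat=len(names)):
        x = dict(zip(names, xs))
        f_args = tuple(x[alpha.get(j, (1, j))] for j in range(1, k1 + 1))
        g_args = tuple(x[(2, jp)] for jp in range(1, k2 + 1))
        table[xs] = f[f_args] * g[g_args]            # formula (4.4)
    for _, jp, colour in edges:                      # formula (4.5)
        op = P if colour == 1 else Q
        names, table = op(names, table, (2, jp))
    return names, table


f = make_canonical(2, lambda a, b: a * a + 3 * a * b + b)
g = make_canonical(2, lambda a, b: a * b * b + 2 * a)
names, table = f_circ_g_gamma(f, 2, g, 2, [(1, 1, -1), (2, 2, 1)])
print(names, len(names) == 2 + 2 - (2 * 1 + 1))
```

Exact rational arithmetic is used so that the vanishing of the integrals can be tested with equality rather than with a numerical tolerance.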

Now we are in the position to formulate the diagram formula for the product of
two degenerate U-statistics.

Theorem B. Let us have a sequence of iid. random variables $\xi_1,\xi_2,\dots$ with some
distribution $\mu$ on some measurable space $(X,\mathcal X)$ together with two bounded, canonical
functions $f(x_1,\dots,x_{k_1})$ and $g(x_1,\dots,x_{k_2})$ with respect to the probability measure $\mu$ on
the spaces $(X^{k_1},\mathcal X^{k_1})$ and $(X^{k_2},\mathcal X^{k_2})$. Let us introduce the class of coloured diagrams
$\Gamma(k_1,k_2)$ defined above together with the functions $(f\circ g)_\gamma$ defined in formulas
(4.1)-(4.5).

For all $\gamma\in\Gamma(k_1,k_2)$ the function $(f\circ g)_\gamma$ is canonical with respect to the measure $\mu$ with
$k(\gamma) = k_1+k_2-(2|Z(\gamma)|+|W(\gamma)|)$ arguments, where $|Z(\gamma)|$ denotes the number of edges
with colour $1$ and $|W(\gamma)|$ the number of edges with colour $-1$ of the coloured diagram $\gamma$.
The product of the degenerate U-statistics $I_{n,k_1}(f)$ and $I_{n,k_2}(g)$, $n\ge\max(k_1,k_2)$,
defined in (1.1) satisfies the identity

$$n^{-(k_1+k_2)/2}k_1!\,I_{n,k_1}(f)\,k_2!\,I_{n,k_2}(g) = {\sum}^{\prime(n)}_{\gamma\in\Gamma(k_1,k_2)} \frac{\prod\limits_{j=1}^{|Z(\gamma)|}\bigl(n-(k_1+k_2)+|W(\gamma)|+|Z(\gamma)|+j\bigr)}{n^{|Z(\gamma)|}}\; n^{-|W(\gamma)|/2}\cdot k(\gamma)!\,n^{-k(\gamma)/2}\,I_{n,k(\gamma)}\bigl((f\circ g)_\gamma\bigr), \tag{4.6}$$

where ${\sum}^{\prime(n)}$ means that summation is taken only for such coloured diagrams
$\gamma\in\Gamma(k_1,k_2)$ which satisfy the inequality $k_1+k_2-(|Z(\gamma)|+|W(\gamma)|)\le n$, and
$\prod_{j=1}^{|Z(\gamma)|}$ equals $1$ in the case $|Z(\gamma)| = 0$.

The $L_2$-norm of the functions $(f\circ g)_\gamma$ is defined by the formula

$$\|(f\circ g)_\gamma\|_2^2 = \int (f\circ g)_\gamma^2(x_{(1,j)},x_{(2,j')},\ j\in\{1,\dots,k_1\}\setminus B_u(\gamma),\ j'\in\{1,\dots,k_2\}\setminus B_{(b,1)}(\gamma)) \prod_{(1,j)\colon j\in\{1,\dots,k_1\}\setminus B_u(\gamma)}\mu(dx_{(1,j)}) \prod_{(2,j')\colon j'\in\{1,\dots,k_2\}\setminus B_{(b,1)}(\gamma)}\mu(dx_{(2,j')}).$$

If $|W(\gamma)| = 0$, then the inequality

$$\|(f\circ g)_\gamma\|_2 \le \|f\|_2\|g\|_2 \quad\text{if } |W(\gamma)| = 0 \tag{4.7}$$

holds. In the general case we can say that if the functions $f$ and $g$ satisfy formula (1.3),
then also the inequality

$$\|(f\circ g)_\gamma\|_2 \le 2^{|W(\gamma)|}\min(\|f\|_2,\|g\|_2) \tag{4.8}$$

holds. Relations (4.7) and (4.8) remain valid even if we drop the condition that the
functions $f$ and $g$ are canonical.

Relations (4.7) and (4.8) mean in particular, that we have a better estimate for
$\|(f\circ g)_\gamma\|_2$ in the case when the coloured diagram $\gamma$ contains no edge with colour $-1$,
i.e. $|W(\gamma)| = 0$, than in the case when it contains at least one edge with colour $-1$.

Let us understand how we define those terms at the right-hand side of (4.6) for
which $k(\gamma) = 0$. In this case $(f\circ g)_\gamma$ is a constant, and to make formula (4.6) meaningful
we have to define the term $I_{n,k(\gamma)}((f\circ g)_\gamma)$ also in this case. The following convention
will be used. A constant $c$ will be called a degenerate U-statistic of order zero, and we
define $I_{n,0}(c) = c$.
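
The simplest instance of the diagram formula, $k_1 = k_2 = 1$ and $n\ge2$, can be written out by hand. Here $I_{n,1}(f) = \sum_{j=1}^n f(\xi_j)$, and $\Gamma(1,1)$ consists of three diagrams: no edge, one edge with colour $-1$, and one edge with colour $1$. The worked identity below is a sketch based on the definitions (4.1)-(4.5) and the convention $I_{n,0}(c) = c$; the annotations on the right name the diagram that contributes each term.

```latex
\begin{align*}
n^{-1} I_{n,1}(f)\, I_{n,1}(g)
 &= n^{-1}\sum_{\substack{1\le j,j'\le n\\ j\neq j'}} f(\xi_j)\,g(\xi_{j'})
   && \text{no edge: } k(\gamma)=2,\ \ 2!\,n^{-1} I_{n,2}(f\circ g) \\
 &\quad + n^{-1}\sum_{j=1}^{n}\Bigl(f(\xi_j)g(\xi_j) - \int f(x)g(x)\,\mu(dx)\Bigr)
   && \text{colour } -1:\ k(\gamma)=1,\ \ n^{-1/2}\cdot n^{-1/2} I_{n,1}\bigl(Q_{(2,1)}(fg)\bigr) \\
 &\quad + \int f(x)g(x)\,\mu(dx)
   && \text{colour } 1:\ k(\gamma)=0,\ \ \tfrac{n-2+0+1+1}{n}\, P_{(2,1)}(fg)
\end{align*}
```

The last two lines add up to $n^{-1}\sum_j f(\xi_j)g(\xi_j)$, the diagonal part of the product, and the first line is its off-diagonal part.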

Theorem B can be considered as a version of the result of paper [8], where a
similar diagram formula was proved about multiple random integrals with respect to
normalized empirical measures. Degenerate U-statistics can also be presented as such
integrals with special, canonical kernel functions. Hence there is a close relation between
the results of this paper and [8]. But there are also some essential differences. On the
one hand, the diagram formula for multiple random integrals with respect to normalized
empirical measures is simpler than the analogous result about the product of degenerate
U-statistics, because the kernel functions in these integrals need not be special, canonical
functions. On the other hand, the diagram formula for degenerate U-statistics yields
a simpler formula about the expected value of the product of degenerate U-statistics,
because the expected value of a degenerate U-statistic equals zero, while the analogous
result about multiple random integrals with respect to normalized empirical measures
does not hold. Another difference between this paper and [8] is that here I worked out
a new notation which, I hope, is more transparent.

5. The diagram formula for the product of several degenerate U-statistics.

We can also express the product of more than two degenerate U-statistics in the form
of sums of degenerate U-statistics by applying Theorem B recursively. We shall present
this result in Theorem B' and prove it together with an estimate about the $L_2$-norm
of the kernel functions of the degenerate U-statistics appearing in Theorem B'. This
estimate will be given in Theorem C. Since the expected value of all degenerate
U-statistics of order $k\ge1$ equals zero, the representation of the product of U-statistics
in the form of a sum of degenerate U-statistics implies that the expected value of this
product equals the sum of the constant terms in this representation. In such a way
we get a version of Theorem A for the expected value of a product of degenerate
U-statistics which together with Theorem C will be sufficient to prove Proposition B. But
the formula we get in this way is more complicated than the analogous diagram formula
for products of Wiener-Ito integrals. To overcome this difficulty we have to work out a
good "book-keeping method".

Let us have a sequence of iid. random variables $\xi_1,\xi_2,\dots$ taking values on a
measurable space $(X,\mathcal X)$ with some distribution $\mu$, and consider $L$ functions $f_l(x_1,\dots,x_{k_l})$
on the measure spaces $(X^{k_l},\mathcal X^{k_l})$, $1\le l\le L$, canonical with respect to the
measure $\mu$. We want to represent the product of $L\ge2$ normalized degenerate U-statistics
$n^{-k_l/2}k_l!\,I_{n,k_l}(f_{k_l})$ in the form of a sum of degenerate U-statistics similarly to
Theorem B. For this goal I define a class of coloured diagrams $\Gamma(k_1,\dots,k_L)$ together
with some canonical functions $F_\gamma = F_\gamma(f_{k_1},\dots,f_{k_L})$ depending on the diagrams
$\gamma\in\Gamma(k_1,\dots,k_L)$ and the functions $f_l(x_1,\dots,x_{k_l})$, $1\le l\le L$.

The coloured diagrams will be graphs with vertices $(l,j)$ and $(l,j,C)$, $1\le l\le L$,
$1\le j\le k_l$, and edges between some of these vertices which will get either colour $1$ or
colour $-1$. The set of vertices $\{(l,j),\ (l,j,C),\ 1\le j\le k_l\}$ will be called the $l$-th row of
the diagrams. (The vertices $(l,j,C)$ are introduced, because it turned out to be useful
to take a copy $(l,j,C)$ of some vertices $(l,j)$. The letter C was just chosen to indicate
that it is a copy.) From all vertices there starts either zero or one edge, and edges
may connect only vertices in different rows. We shall call all vertices of the form $(l,j)$
permissible, and beside this some of the vertices $(l,j,C)$ will also be called permissible.
Those vertices will be called permissible from which some edge may start.

We shall say that an edge connecting two vertices $(l_1,j_1)$ with $(l_2,j_2)$ or (a
permissible) vertex $(l_1,j_1,C)$ with another vertex $(l_2,j_2)$ such that $l_2 > l_1$ is of level $l_2$, and
$(l_2,j_2)$ will be called the lower end-point of such an edge. (The coloured diagrams we
shall define contain only edges with lower end-points of the form $(l,j)$.) We shall call
the restriction of the diagram $\gamma$ to level $l$ that part of a diagram $\gamma$ which contains
all of its vertices together with those edges (together with their colours) whose levels
are less than or equal to $l$, and tells which of the vertices $(l',j,C)$ are permissible for
$1\le l'\le l$. We shall define the diagrams $\gamma\in\Gamma(k_1,\dots,k_L)$ inductively by defining
their restrictions $\gamma(l)$ to level $l$ for all $l = 1,2,\dots,L$. Those diagrams $\gamma$ will belong to
$\Gamma(k_1,\dots,k_L)$ whose restrictions $\gamma(l)$ can be defined through the following procedure for
all $l = 1,2,\dots,L$.

The restriction $\gamma(1)$ of a diagram $\gamma$ to level $1$ contains no edges, and no vertex of
the form $(1,j,C)$, $1\le j\le k_1$, is permissible. If we have defined the restrictions $\gamma(l-1)$
for some $2\le l\le L$, then those diagrams will be called restrictions $\gamma(l)$ at level $l$ which
can be obtained from a restriction $\gamma(l-1)$ in the following way: Take the vertices $(l,j)$,
$1\le j\le k_l$, from the $l$-th row and from each of them either no edge starts or one edge
starts which gets either colour $1$ or colour $-1$. The other end-point must be such a
vertex $(l',j')$ or a permissible vertex $(l',j',C)$ with some $1\le l' < l$ which is not an
end-point of an edge in $\gamma(l-1)$, and naturally such a vertex can be connected only
with one of the vertices $(l,j)$, $1\le j\le k_l$. We define $\gamma(l)$ first by adjusting the coloured
edges constructed in the above way to the (coloured) edges of $\gamma(l-1)$, and the set of
permissible vertices in $\gamma(l)$ will contain beside the permissible vertices of $\gamma(l-1)$ and the
vertices $(l,j)$, $1\le j\le k_l$, those vertices $(l,j,C)$ for which $(l,j)$ is the lower end-point
of an edge with colour $-1$ in $\gamma(l)$. $\Gamma(k_1,\dots,k_L)$ will consist of all coloured diagrams
$\gamma = \gamma(L)$ obtained in such a way.
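
The inductive procedure can be turned into a counting scheme. The Python sketch below is an illustration under one simplifying assumption: it does not store the diagrams themselves, only how many of them lead, after each level, to a given number of permissible vertices without an edge and a given number of edges with colour $-1$. The function name and the state representation are hypothetical.

```python
from collections import Counter
from math import comb, factorial


def count_diagrams(row_sizes):
    """Count coloured diagrams, grouped by (free permissible vertices, colour -1 edges)."""
    states = Counter({(row_sizes[0], 0): 1})    # level 1: no edges, k_1 free vertices
    for k_l in row_sizes[1:]:
        new_states = Counter()
        for (free, w_total), ways in states.items():
            for m in range(min(free, k_l) + 1):  # m edges have their lower end-point in the new row
                matchings = comb(k_l, m) * comb(free, m) * factorial(m)
                for w in range(m + 1):           # w of these edges get colour -1
                    z = m - w
                    # Each edge uses up two vertices; a colour -1 edge adds a permissible copy.
                    new_free = free + k_l - 2 * z - w
                    new_states[(new_free, w_total + w)] += ways * matchings * comb(m, w)
        states = new_states
    return states


states = count_diagrams((1, 1, 1, 1))
closed = {w: n for (free, w), n in states.items() if free == 0}
print(sum(states.values()), closed)
```

Diagrams counted with `free == 0` are those in which every permissible vertex is the end-point of an edge. For two rows the total agrees with a direct enumeration of the two-row coloured diagrams.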

Given a coloured diagram $\gamma\in\Gamma(k_1,\dots,k_L)$ we shall define recursively some
(canonical) functions $F_{l,\gamma}$ with the help of the functions $f_1,\dots,f_l$ together with some constants
$J_n(l,\gamma)$ for all $1\le l\le L$ in the way suggested by Theorem B. Then we put $F_\gamma = F_{L,\gamma}$
and give the desired representation of the product of the degenerate U-statistics with the
help of U-statistics with kernel functions $F_\gamma$ and constants $J_n(l,\gamma)$, $\gamma\in\Gamma(k_1,\dots,k_L)$,
$1\le l\le L$.

Let us fix some coloured diagram $\gamma\in\Gamma(k_1,\dots,k_L)$ and introduce the following
notations: Let $B_{(b,-1)}(l,\gamma)$ denote the set of lower end-points of the form $(l,j)$ of edges
with colour $-1$ and $B_{(b,1)}(l,\gamma)$ the set of lower end-points of the form $(l,j)$ with colour $1$.
Let $U(l,\gamma)$ denote the set of those permissible vertices $(l',j)$ and $(l',j,C)$ with $l'\le l$
from which no edge starts in the restriction $\gamma(l)$ of the diagram $\gamma$ to level $l$, i.e. either no
edge starts from this vertex, or if some edge starts from it, then its other end-point is a
vertex $(l'',j'')$ with $l'' > l$. Beside this, given some integer $1\le l_1\le l$ let $U(l,l_1,\gamma)$ denote
the restriction of $U(l,\gamma)$ to its first $l_1$ rows, i.e. $U(l,l_1,\gamma)$ consists of those vertices
$(l',j')$ and $(l',j',C)$ which are contained in $U(l,\gamma)$, and $l'\le l_1$. We shall define the
functions $F_{l,\gamma}$ with arguments of the form $x_{(l',j)}$ and $x_{(l',j,C)}$ with $(l',j)\in U(l,\gamma)$ and
$(l',j,C)\in U(l,\gamma)$ together with some constants $J_n(l,\gamma)$. For this end put first

$$F_{1,\gamma}(x_{(1,1)},\dots,x_{(1,k_1)}) = f_1(x_{(1,1)},\dots,x_{(1,k_1)}). \tag{5.1}$$

To define the function $F_{l,\gamma}$ for $l\ge2$ first we introduce a function $\alpha_{l,\gamma}(\cdot)$ on the set of
vertices in $U(l-1,\gamma)$ in the following way. If a vertex $(l',j')$ or $(l',j',C)$ in $U(l-1,\gamma)$
is such that it is connected to no vertex $(l,j)$, $1\le j\le k_l$, then $\alpha_{l,\gamma}(l',j') = (l',j')$,
$\alpha_{l,\gamma}(l',j',C) = (l',j',C)$, and if $(l',j')$ is connected to a vertex $(l,j)$, then $\alpha_{l,\gamma}(l',j') = (l,j)$,
if $(l',j',C)$ is connected with a vertex $(l,j)$, then $\alpha_{l,\gamma}(l',j',C) = (l,j)$. We define,
similarly to the formula (4.4) the functions

$$\bar{\bar F}_{l,\gamma}(x_{(l',j')},x_{(l',j',C)},\ (l',j') \text{ and } (l',j',C)\in U(l,l-1,\gamma),\ x_{(l,j)},\ 1\le j\le k_l) = F_{l-1,\gamma}(x_{\alpha_{l,\gamma}(l',j')},x_{\alpha_{l,\gamma}(l',j',C)},\ (l',j') \text{ and } (l',j',C)\in U(l-1,\gamma))\, f_l(x_{(l,1)},\dots,x_{(l,k_l)}), \tag{5.2}$$

i.e. we take the function $F_{l-1,\gamma}\circ f_l$ and replace the arguments of this function indexed
by such a vertex of $\gamma$ which is connected by an edge with a vertex in the $l$-th row of $\gamma$
by the argument indexed with the lower end-point of this edge.

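Then we define with the help of the operators $P_{u_j}$ and $Q_{u_j}$ introduced in (4.2) and
(4.3) the functions

$$\bar F_{l,\gamma}(x_{(l',j')},x_{(l',j',C)},\ (l',j') \text{ and } (l',j',C)\in U(l,l-1,\gamma),\ x_{(l,j)},\ (l,j)\in\{(l,1),\dots,(l,k_l)\}\setminus B_{(b,1)}(l,\gamma)) = \prod_{(l,j)\in B_{(b,1)}(l,\gamma)} P_{(l,j)} \prod_{(l,j)\in B_{(b,-1)}(l,\gamma)} Q_{(l,j)}\ \bar{\bar F}_{l,\gamma}(x_{(l',j')},x_{(l',j',C)},\ (l',j') \text{ and } (l',j',C)\in U(l,l-1,\gamma),\ x_{(l,j)},\ 1\le j\le k_l), \tag{5.3}$$

similarly to the formula (4.5), i.e. we apply for the function $\bar{\bar F}_{l,\gamma}$ the operators $P_{(l,j)}$
for those indices $(l,j)$ which are the lower end-points of an edge with colour $1$ and the
operators $Q_{(l,j)}$ for those indices $(l,j)$ which are the lower end-points of an edge with
colour $-1$.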

Finally we define the function $F_{l,\gamma}$ simply by reindexing some arguments of the
function $\bar F_{l,\gamma}$ to get a function which is indexed by the vertices in $U(l,\gamma)$. To this end
we define the function $A_{l,\gamma}(\cdot)$ on the set of vertices $\{(l,j)\colon (l,j)\in\{(l,1),\dots,(l,k_l)\}\setminus B_{(b,1)}(l,\gamma)\}$
as $A_{l,\gamma}(l,j) = (l,j,C)$ if $(l,j)\in B_{(b,-1)}(l,\gamma)$, and $A_{l,\gamma}(l,j) = (l,j)$ if
$(l,j)\in\{(l,1),\dots,(l,k_l)\}\setminus(B_{(b,1)}(l,\gamma)\cup B_{(b,-1)}(l,\gamma))$. Then we put

$$F_{l,\gamma}(x_{(l',j')},x_{(l',j',C)},\ (l',j') \text{ and } (l',j',C)\in U(l,\gamma)) = \bar F_{l,\gamma}(x_{(l',j')},x_{(l',j',C)},\ (l',j') \text{ and } (l',j',C)\in U(l,l-1,\gamma),\ x_{A_{l,\gamma}(l,j)},\ (l,j)\in\{(l,1),\dots,(l,k_l)\}\setminus B_{(b,1)}(l,\gamma)). \tag{5.4}$$

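We define beside the functions $F_\gamma = F_{L,\gamma}$ the following constants $J_n(l,\gamma)$,
$1\le l\le L$: $J_n(1,\gamma) = 1$, and

$$J_n(l,\gamma) = \frac{\prod\limits_{j=1}^{|B_{(b,1)}(l,\gamma)|}\bigl(n-(k(\gamma(l-1))+k_l)+|B_{(b,-1)}(l,\gamma)|+|B_{(b,1)}(l,\gamma)|+j\bigr)}{n^{|B_{(b,1)}(l,\gamma)|}}, \quad 2\le l\le L, \tag{5.5}$$

if $|B_{(b,1)}(l,\gamma)|\ge1$, and $J_n(l,\gamma) = 1$ if $|B_{(b,1)}(l,\gamma)| = 0$, where $|B_{(b,1)}(l,\gamma)|$ and
$|B_{(b,-1)}(l,\gamma)|$ denote the number of those edges in $\gamma$ with colour $1$ and with colour
$-1$ respectively whose lower end-point is in the $l$-th row of $\gamma$, and $k(\gamma(l-1))$ is the
number of arguments of the function $F_{l-1,\gamma}$.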

Now we can formulate the following generalization of Theorem B. 

Theorem B'. Let us have a sequence of iid. random variables $\xi_1,\xi_2,\dots$ with some
distribution $\mu$ on a measurable space $(X,\mathcal X)$ together with $L\ge2$ bounded functions
$f_l(x_1,\dots,x_{k_l})$ on the spaces $(X^{k_l},\mathcal X^{k_l})$, $1\le l\le L$, canonical with respect to the
probability measure $\mu$. Let us introduce the class of coloured diagrams $\Gamma(k_1,\dots,k_L)$ defined
above together with the functions $F_\gamma = F_{L,\gamma}(f_1,\dots,f_L)$ defined in formulas (5.1)-(5.4)
and the constants $J_n(l,\gamma)$, $1\le l\le L$, given in formula (5.5).

Put $k(\gamma(l)) = \sum\limits_{p=1}^{l}k_p - \sum\limits_{p=2}^{l}\bigl(2|B_{(b,1)}(p,\gamma)|+|B_{(b,-1)}(p,\gamma)|\bigr)$, where $|B_{(b,1)}(p,\gamma)|$
denotes the number of lower end-points in the $p$-th row of $\gamma$ with colour $1$ and $|B_{(b,-1)}(p,\gamma)|$
is the number of lower end-points in the $p$-th row of $\gamma$ with colour $-1$, $1\le l\le L$, and
define $k(\gamma) = k(\gamma(L))$. Then $k(\gamma(l))$ is the number of variables of the function $F_{l,\gamma}$,
$1\le l\le L$.

The functions $F_\gamma$ are canonical with respect to the measure $\mu$ with $k(\gamma)$ variables,
and the product of the degenerate U-statistics $I_{n,k_l}(f_{k_l})$, $1\le l\le L$, $n\ge\max\limits_{1\le l\le L}k_l$, defined in (1.1)
satisfies the identity

$$\prod_{l=1}^{L}k_l!\,n^{-k_l/2}I_{n,k_l}(f_{k_l}) = {\sum}^{\prime(n,L)}_{\gamma\in\Gamma(k_1,\dots,k_L)}\left(\prod_{l=1}^{L}J_n(l,\gamma)\right)n^{-|W(\gamma)|/2}\cdot k(\gamma)!\,n^{-k(\gamma)/2}\,I_{n,k(\gamma)}(F_\gamma), \tag{5.6}$$

where $|W(\gamma)| = \sum\limits_{l=2}^{L}|B_{(b,-1)}(l,\gamma)|$ is the number of edges with colour $-1$ in the coloured
diagram $\gamma$, and ${\sum}^{\prime(n,L)}$ means that summation is taken for those $\gamma\in\Gamma(k_1,\dots,k_L)$
which satisfy the relation $k(\gamma(l-1))+k_l-(|B_{(b,1)}(l,\gamma)|+|B_{(b,-1)}(l,\gamma)|)\le n$ for all
$2\le l\le L$.

Let $\bar\Gamma(k_1,\dots,k_L)$ denote the class of those coloured diagrams of $\Gamma(k_1,\dots,k_L)$ for
which every permissible vertex is the end-point of some edge. A coloured diagram
$\gamma\in\Gamma(k_1,\dots,k_L)$ satisfies the relation $\gamma\in\bar\Gamma(k_1,\dots,k_L)$ if and only if $k(\gamma) = 0$.
In this case $F_\gamma$ is constant, and $I_{n,k(\gamma)}(F_\gamma) = F_\gamma$. For all other coloured diagrams
$\gamma\in\Gamma(k_1,\dots,k_L)$, $k(\gamma) > 0$. The identity

$$E\left(\prod_{l=1}^{L}k_l!\,n^{-k_l/2}I_{n,k_l}(f_{k_l})\right) = {\sum}^{\prime(n,L)}_{\gamma\in\bar\Gamma(k_1,\dots,k_L)}\left(\prod_{l=1}^{L}J_n(l,\gamma)\right)n^{-|W(\gamma)|/2}\cdot F_\gamma \tag{5.7}$$

holds.

Theorem B' can be deduced relatively simply from Theorem B by induction with
respect to the number $L$ of the functions. Theorem B contains the results of Theorem B'
in the case $L = 2$. A simple induction argument together with the formulas describing
the functions $F_{l,\gamma}$ by means of the functions $F_{l-1,\gamma}$ and $f_l$ and Theorem B imply that
all functions $F_{l,\gamma}$ in Theorem B' are canonical. Finally, an inductive procedure with
respect to the number $L$ of the functions $f_l$ shows that relation (5.6) holds. Indeed,
by exploiting that formula (5.6) holds for the product of the first $L-1$ degenerate
U-statistics, then multiplying this identity with the last U-statistic and applying for
each term at the right-hand side Theorem B we get that relation (5.6) also holds for
the product of $L$ degenerate U-statistics.

A simple inductive procedure with respect to $l$ shows that for all $2\le l\le L$ the
diagram $\gamma(l)$ contains $k(\gamma(l)) = \sum\limits_{p=1}^{l}k_p - \sum\limits_{p=2}^{l}\bigl(2|B_{(b,1)}(p,\gamma)|+|B_{(b,-1)}(p,\gamma)|\bigr)$ permissible
vertices in its first $l$ rows which are not the end-point of an edge in $\gamma(l)$. Since $\gamma$ has
$\sum\limits_{p=1}^{L}k_p + \sum\limits_{p=2}^{L}|B_{(b,-1)}(p,\gamma)|$ permissible vertices this identity with $l = L$ implies that $k(\gamma) = 0$
if and only if $\gamma\in\bar\Gamma(k_1,\dots,k_L)$ with the class of coloured diagrams $\bar\Gamma(k_1,\dots,k_L)$
introduced at the end of Theorem B'. Since $EI_{n,k}(f) = 0$ for all degenerate U-statistics
of order $k\ge1$, the above property and relation (5.6) imply identity (5.7).

In the proof of Proposition B we shall also need an estimate formulated in Theo- 
rem C. It is a simple consequence of inequalities (4.7) and (4.8) in Theorem B. 

Theorem C. Let us have $L$ functions $f_l(x_1,\dots,x_{k_l})$ on the spaces $(X^{k_l},\mathcal X^{k_l})$, $1\le l\le L$,
which satisfy formulas (1.3) and (1.4) (if we replace the index $k$ by index $k_l$ in these
formulas), but these functions need not be canonical. Let us take a coloured diagram
$\gamma\in\Gamma(k_1,\dots,k_L)$ and consider the function $F_\gamma = F_{L,\gamma}(f_1,\dots,f_L)$ defined by formulas
(5.1)-(5.5). The $L_2$-norm of the function $F_\gamma$ (with respect to a power of the measure
$\mu$ to the space, where $F_\gamma$ is defined) satisfies the inequality $\|F_\gamma\|_2\le2^{|W(\gamma)|}\sigma^{(L-U(\gamma))}$,
where $|W(\gamma)|$ denotes the number of edges of colour $-1$, and $U(\gamma)$ the number of rows
which contain a lower end-point of an edge of colour $-1$ in the coloured diagram $\gamma$.

Proof of Theorem C. We shall prove the inequality

$$\|F_{l,\gamma}\|_2\le2^{|W(l,\gamma)|}\sigma^{(l-U(l,\gamma))} \quad\text{for all } 1\le l\le L, \tag{5.8}$$

where $|W(l,\gamma)|$ denotes the number of edges with colour $-1$, and $U(l,\gamma)$ is the number
of rows containing a lower end-point of an edge with colour $-1$ in the coloured diagram
$\gamma(l)$. Formula (5.8) will be proved by means of induction with respect to $l$. It implies
Theorem C with the choice $l = L$.

Relation (5.8) clearly holds for $l = 1$. To prove this relation by induction with
respect to $l$ for all $1\le l\le L$ let us first observe that $\sup 2^{-|W(l,\gamma)|}|F_{l,\gamma}|\le1$ for all
$1\le l\le L$. This relation can be simply checked by induction with respect to $l$.

If we know relation (5.8) for $l-1$, then it follows for $l$ from relation (4.7) if
$|B_{(b,-1)}(l,\gamma)| = 0$, that is if there is no edge of colour $-1$ with lower end-point in the $l$-th
row. Indeed, in this case $\|F_{l,\gamma}(f_1,\dots,f_l)\|_2\le\|F_{l-1,\gamma}\|_2\|f_l\|_2\le\|F_{l-1,\gamma}(f_1,\dots,f_{l-1})\|_2\cdot\sigma$,
$|W(l,\gamma)| = |W(l-1,\gamma)|$, and $U(l,\gamma) = U(l-1,\gamma)$. Hence relation (5.8) holds in this
case.

If $|B_{(b,-1)}(l,\gamma)|\ge1$, then we can apply formula (4.8) for the expression
$\|F_{l,\gamma}\|_2 = \|\bar F_{l,\gamma}\|_2 = \|(F_{l-1,\gamma}\circ f_l)_{\bar\gamma(l)}\|_2$, where $\bar\gamma(l)$ is that coloured diagram with two rows whose
first row consists of the indices of the variables of the function $F_{l-1,\gamma}$, its second row
consists of the vertices $(l,j)$, $1\le j\le k_l$, and $\bar\gamma(l)$ contains the edges of $\gamma$ between
these vertices together with their colour. Then relation (4.8) implies that
$\|F_{l,\gamma}\|_2\le2^{|B_{(b,-1)}(l,\gamma)|}\|F_{l-1,\gamma}\|_2\le2^{(|W(l-1,\gamma)|+|B_{(b,-1)}(l,\gamma)|)}\sigma^{l-1-U(l-1,\gamma)}$ if $|B_{(b,-1)}(l,\gamma)|\ge1$.
Beside this, $|W(l-1,\gamma)|+|B_{(b,-1)}(l,\gamma)| = |W(l,\gamma)|$, and $l-1-U(l-1,\gamma) = l-U(l,\gamma)$
in this case. Hence relation (5.8) holds in this case, too.

6. The proof of Theorem 3. 

First we prove Proposition B. 

Proof of Proposition B. We shall prove relation (2.3) by means of Theorem C and identity
(5.7) with the choice $L = 2M$ and $f_l(x_1,\dots,x_{k_l}) = f(x_1,\dots,x_k)$ for all $1\le l\le2M$.
We shall partition the class of coloured diagrams $\gamma\in\bar\Gamma(k,M) = \bar\Gamma(k,\dots,k)$ (with $k$
repeated $2M$ times) with
the property that all permissible vertices are the end-points of some edge to classes
$\bar\Gamma(k,M,p)$, $0\le p\le kM$, in the following way: $\gamma\in\bar\Gamma(k,M,p)$ for a coloured diagram
$\gamma\in\bar\Gamma(k,M)$ if and only if it has $2p$ permissible vertices of the form $(l,j,C)$. (A coloured
diagram $\gamma\in\bar\Gamma(k,M)$ has even number of such vertices.) First we prove the following
estimate:

There exists some constant $A = A(k) > 0$ and threshold index $M_0 = M_0(k)$
such that for all $M\ge M_0$ and $0\le p\le kM$ the cardinality $|\bar\Gamma(k,M,p)|$ of the
set $\bar\Gamma(k,M,p)$ can be bounded from above by $A2^{2p}\binom{2kM}{2p}\left(\frac2e\right)^{kM}(kM)^{kM+p}$.

We can bound the number of coloured diagrams in $\bar\Gamma(k,M,p)$ by calculating first
the number of choices of the $2p$ permissible vertices from the $2kM$ vertices of the form
$(l,j,C)$ which we adjust to the $2kM$ permissible vertices $(l,j)$ and then by calculating
the number of such graphs whose vertices are the above permissible vertices, and from
all vertices there starts exactly one edge. (Here we allow to connect vertices from the
same row. Observe that by defining the set of permissible vertices $(l,j,C)$ in a coloured
diagram $\gamma$ we also determine the colouring of its edges.) Thus we get by using the
argument at the beginning of Proposition A that $|\bar\Gamma(k,M,p)|$ can be bounded from above
by $\binom{2kM}{2p}1\cdot3\cdot5\cdots(2kM+2p-1) = \binom{2kM}{2p}\frac{(2kM+2p)!}{2^{kM+p}(kM+p)!}$. We can write by the Stirling
formula, similarly to relation (2.2) that $\frac{(2kM+2p)!}{2^{kM+p}(kM+p)!}\le A\left(\frac2e\right)^{kM+p}(kM+p)^{kM+p}$ with
some constant $A > \sqrt2$ if $M\ge M_0$ with some $M_0 = M_0(A)$. Since $p\le kM$ we can
write $(kM+p)^{kM+p}\le(kM)^{kM}\left(1+\frac{p}{kM}\right)^{kM}(2kM)^p\le(kM)^{kM+p}e^p2^p$. The above
inequalities imply that

$$|\bar\Gamma(k,M,p)|\le A\binom{2kM}{2p}\left(\frac2e\right)^{kM}(kM)^{kM+p}2^{2p} \quad\text{if } M\ge M_0, \tag{6.1}$$

as we have claimed.
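
The Stirling-type bound used above, $1\cdot3\cdots(2N-1) = \frac{(2N)!}{2^NN!}\le A\left(\frac2e\right)^NN^N$, can be explored numerically. The Python sketch below computes the ratio of the two sides without the constant $A$; by Stirling's formula this ratio tends to $\sqrt2$, which is consistent with the requirement $A > \sqrt2$. The function names are illustrative.

```python
import math


def log_odd_double_factorial(n):
    """log(1 * 3 * ... * (2n - 1)) = log((2n)! / (2^n n!)), computed via lgamma."""
    return math.lgamma(2 * n + 1) - n * math.log(2) - math.lgamma(n + 1)


def stirling_ratio(n):
    """Ratio of 1 * 3 * ... * (2n - 1) to (2 / e)^n n^n."""
    log_bound = n * (math.log(2) - 1) + n * math.log(n)
    return math.exp(log_odd_double_factorial(n) - log_bound)


for n in (1, 5, 25, 125):
    print(n, stirling_ratio(n))
```

Working with logarithms avoids overflow for large $N$. In the proof the bound is applied with $N = kM+p$.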

Observe that for $\gamma\in\bar\Gamma(k,M,p)$ the quantities introduced in the formulation of
Theorems B' and C satisfy the relations $|W(\gamma)| = 2p$, $|F_\gamma| = \|F_\gamma\|_2$ and $U(\gamma)\le|W(\gamma)| = 2p$.
Hence by Theorem C we have $n^{-|W(\gamma)|/2}|F_\gamma|\le2^pn^{-p}\sigma^{2M-U(\gamma)}\le2^p(n\sigma^2)^{-p}\sigma^{2M}\le\eta^p2^p(kM)^{-p}\sigma^{2M}$
if $kM\le\eta n\sigma^2$ and $\sigma^2\le1$.
This estimate together with relation (5.7) and the fact that the constants $J_n(l,\gamma)$
defined in (5.5) are bounded by $1$ imply that for $kM\le\eta n\sigma^2$

$$E\left(n^{-k/2}k!\,I_{n,k}(f)\right)^{2M}\le\sum_{\gamma\in\bar\Gamma(k,M)}n^{-|W(\gamma)|/2}\cdot|F_\gamma|\le\sum_{p=0}^{kM}|\bar\Gamma(k,M,p)|\,\eta^p2^p(kM)^{-p}\sigma^{2M}.$$

Hence by formula (6.1)

$$E\left(n^{-k/2}k!\,I_{n,k}(f)\right)^{2M}\le A\left(\frac2e\right)^{kM}(kM)^{kM}\sigma^{2M}\sum_{p=0}^{kM}\binom{2kM}{2p}\left(2\sqrt{2\eta}\right)^{2p}\le A\left(\frac2e\right)^{kM}(kM)^{kM}\sigma^{2M}\left(1+2\sqrt{2\eta}\right)^{2kM}$$

if $kM\le\eta n\sigma^2$. Thus we have proved Proposition B with $C = 2\sqrt2$.

Proof of Theorem 3. We can write by the Markov inequality and Proposition B with
$\eta = \frac{kM}{n\sigma^2}$ that

$$P(k!\,n^{-k/2}|I_{n,k}(f)| > u)\le\frac{E\left(k!\,n^{-k/2}I_{n,k}(f)\right)^{2M}}{u^{2M}}\le A\left(\frac1e\cdot2kM\left(1+C\frac{\sqrt{kM}}{\sqrt n\sigma}\right)^2\left(\frac\sigma u\right)^{2/k}\right)^{kM} \tag{6.2}$$

for all integers $M\ge M_0$ with some $M_0 = M_0(A)$.

We shall prove relation (1.8) with the help of estimate (6.2) first in the case
$D\le\frac u\sigma\le n^{k/2}\sigma^k$ with a sufficiently large constant $D = D(k,C) > 0$ depending on $k$ and the
constant $C$ in (6.2). To this end let us introduce the numbers $\bar M$,

$$k\bar M = \frac12\left(\frac u\sigma\right)^{2/k}\frac{1}{1+B\frac{(u/\sigma)^{1/k}}{\sqrt n\sigma}} = \frac12\left(\frac u\sigma\right)^{2/k}\frac{1}{1+B\left(un^{-k/2}\sigma^{-(k+1)}\right)^{1/k}}$$

with a sufficiently large number $B = B(C) > 0$ and $M = [\bar M]$, where $[x]$ means the
integer part of the number $x$.

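Observe that $\frac{\sqrt{k\bar M}}{\sqrt n\sigma}\le\frac{(u/\sigma)^{1/k}}{\sqrt n\sigma} = \left(un^{-k/2}\sigma^{-(k+1)}\right)^{1/k}\le1$, and

$$\left(1+C\frac{\sqrt{k\bar M}}{\sqrt n\sigma}\right)^2\le1+B\frac{\sqrt{k\bar M}}{\sqrt n\sigma}\le1+B\left(un^{-k/2}\sigma^{-(k+1)}\right)^{1/k}$$

with a sufficiently large $B = B(C) > 0$ if $\frac u\sigma\le n^{k/2}\sigma^k$. Hence

$$\frac1e\cdot2kM\left(1+C\frac{\sqrt{kM}}{\sqrt n\sigma}\right)^2\left(\frac\sigma u\right)^{2/k}\le\frac1e\cdot2k\bar M\left(1+C\frac{\sqrt{k\bar M}}{\sqrt n\sigma}\right)^2\left(\frac\sigma u\right)^{2/k}\le\frac1e\cdot\frac{\left(1+C\frac{\sqrt{k\bar M}}{\sqrt n\sigma}\right)^2}{1+B\left(un^{-k/2}\sigma^{-(k+1)}\right)^{1/k}}\le\frac1e \tag{6.3}$$

if $\frac u\sigma\le n^{k/2}\sigma^k$. If the inequality $D\le\frac u\sigma$ also holds with a sufficiently large $D =
D(B,k) > 0$, then $M\ge M_0$, and the conditions of inequality (6.2) hold. This inequality
together with inequality (6.3) yield that

$$P(k!\,n^{-k/2}|I_{n,k}(f)| > u)\le Ae^{-kM}\le Ae^ke^{-k\bar M}$$

if $D\le\frac u\sigma\le n^{k/2}\sigma^k$, i.e. inequality (1.8) holds in this case with a pre-exponential
constant $Ae^k$. By increasing the pre-exponential constant $Ae^k$ in this inequality we get
that relation (1.8) holds for all $0\le\frac u\sigma\le n^{k/2}\sigma^k$. Thus Theorem 3 is proved.

Let us observe that the above calculations show that the constant $B$ in formula
(1.8) can be chosen independently of the order $k$ of the U-statistics $I_{n,k}(f)$.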



Appendix. The proof of Theorem B. 

The proof of Theorem B. Let us consider all possible sets $\{(u_1,u_1'),\dots,(u_l,u_l')\}$,
$1\le l\le\min(k_1,k_2)$ containing such pairs of integers for which $u_s\in\{1,\dots,k_1\}$, $u_s'\in\{1,\dots,k_2\}$,
$1\le s\le l$, all points $u_1,\dots,u_l$ are different, and the same relation holds for the points
$u_1',\dots,u_l'$, too. Let us correspond the diagram containing two rows $(1,1),\dots,(1,k_1)$
and $(2,1),\dots,(2,k_2)$ and the edges connecting the vertices $(1,u_s)$ and $(2,u_s')$,
$1\le s\le l$ to the set of pairs $\{(u_1,u_1'),\dots,(u_l,u_l')\}$, and let $\bar\Gamma(k_1,k_2)$ denote the set of all
(non-coloured) diagrams we can obtain in such a way. Let us consider the product
$k_1!\,I_{n,k_1}(f)\,k_2!\,I_{n,k_2}(g)$, and rewrite it in the form of the sum we get by carrying out a
term by term multiplication in this expression. Let us put the terms we get in such a
way into disjoint classes indexed by the elements of the diagrams $\bar\gamma\in\bar\Gamma(k_1,k_2)$ in the
following way: A product $f(\xi_{j_1},\dots,\xi_{j_{k_1}})\,g(\xi_{j_1'},\dots,\xi_{j_{k_2}'})$ belongs to the class indexed by
the graph $\bar\gamma\in\bar\Gamma(k_1,k_2)$ with edges $\{((1,u_1),(2,u_1')),\dots,((1,u_l),(2,u_l'))\}$ if $j_{u_s} = j'_{u_s'}$,
$1\le s\le l$, for the indices of the random variables appearing in the above product, and
no more coincidence may exist between the indices $j_1,\dots,j_{k_1},j_1',\dots,j_{k_2}'$.
With such a notation we can write

$$n^{-(k_1+k_2)/2}k_1!\,I_{n,k_1}(f)\,k_2!\,I_{n,k_2}(g) = {\sum}^{\prime(n)}_{\bar\gamma\in\bar\Gamma(k_1,k_2)}n^{-(k_1+k_2)/2}\,\bar k(\bar\gamma)!\,I_{n,\bar k(\bar\gamma)}\bigl(\overline{(f\circ g)}_{\bar\gamma}\bigr), \tag{A1}$$

where the functions $\overline{(f\circ g)}_{\bar\gamma}$ are defined in formulas (4.1) and (4.4). (Observe that
although formula (4.4) was defined by means of coloured diagrams, the colours played
no role in it. The formula remains meaningful, and does not change if we replace the
coloured diagram $\gamma$ by the diagram $\bar\gamma$ we get by omitting the colours of its edges.) The
quantity $\bar k(\bar\gamma)$ equals the number of such vertices of $\bar\gamma$ from the first row from which no
edge starts plus the number of vertices in the second row, and the notation ${\sum}^{\prime(n)}$ means
that summation is taken only for such diagrams $\bar\gamma\in\bar\Gamma(k_1,k_2)$ for which $n\ge\bar k(\bar\gamma)$.

Let the set $V_1=V_1(\bar\gamma)$ consist of those vertices $(1,u_1)=(1,u_1)_{\bar\gamma},\dots,(1,u_{s_1})=(1,u_{s_1})_{\bar\gamma}$ of the first row $\{(1,1),\dots,(1,k_1)\}$ of the diagram $\bar\gamma$ from which no edge starts, and let $V_2=V_2(\bar\gamma)$ contain the vertices $(2,v_1)=(2,v_1)_{\bar\gamma},\dots,(2,v_{s_2})=(2,v_{s_2})_{\bar\gamma}$ from the second row $\{(2,1),\dots,(2,k_2)\}$ of $\bar\gamma$ from which no edges start. Then $\bar k(\bar\gamma)=s_1+k_2$, and the function $\overline{(f\circ g)}_{\bar\gamma}$ has arguments of the form $x_{(1,u_p)}$, $(1,u_p)\in V_1$, and $x_{(2,v)}$, $1\le v\le k_2$.

Relation (A1) is not appropriate for our goal, since the functions $\overline{(f\circ g)}_{\bar\gamma}$ in it
may be non-canonical. Hence we apply Hoeffding's decomposition for the $U$-statistics
$I_{n,\bar k(\bar\gamma)}\bigl(\overline{(f\circ g)}_{\bar\gamma}\bigr)$ in formula (A1) to get the desired representation for the product of
degenerate $U$-statistics. Actually some special properties of the function $\overline{(f\circ g)}_{\bar\gamma}$ enable
us to simplify this decomposition a little.

To carry out this procedure let us observe that a function $f(x_{u_1},\dots,x_{u_k})$ is canonical
if and only if $P_{u_l}f(x_{u_1},\dots,x_{u_k})=0$ with the operator $P_{u_l}$ defined in (4.2) for all
indices $u_l$. Beside this, the condition that the functions $f$ and $g$ are canonical implies
the relations $P_{(1,u)}\overline{(f\circ g)}_{\bar\gamma}=0$ for $(1,u)\in V_1$ and $P_{(2,v)}\overline{(f\circ g)}_{\bar\gamma}=0$ for $(2,v)\in V_2$.

Moreover, these relations remain valid if we replace the functions $\overline{(f\circ g)}_{\bar\gamma}$ by such
functions which we get by applying the product of some transforms $P_{(2,v)}$ and $Q_{(2,v)}$,
$(2,v)\in\{(2,1),\dots,(2,k_2)\}\setminus V_2$, for them with the transforms $P$ and $Q$ defined in formulas
(4.2) and (4.3). (Here we applied such transforms $P$ and $Q$ which are indexed by
those vertices of the second row of $\bar\gamma$ from which some edge starts.)

Beside this, the transforms $P_{(2,v)}$ or $Q_{(2,v)}$ are exchangeable with the operators
$P_{(2,v')}$ or $Q_{(2,v')}$ if $v\neq v'$, $P_{(2,v)}+Q_{(2,v)}=I$, where $I$ denotes the identity operator,
and $P_{(2,v)}Q_{(2,v)}=0$, since $P_{(2,v)}Q_{(2,v)}=P_{(2,v)}-P_{(2,v)}^2=0$. The above relations enable
us to make the following decomposition of the function $\overline{(f\circ g)}_{\bar\gamma}$ to the sum of canonical
functions (just as it is done in the Hoeffding decomposition): Let us introduce the class
$\Gamma(\bar\gamma)$ of those coloured diagrams which we can get by colouring all edges of the diagram
$\bar\gamma$ either with colour 1 or colour $-1$. Some calculation shows that

$$
\overline{(f\circ g)}_{\bar\gamma}
=\prod_{(2,v)\in\{(2,1),\dots,(2,k_2)\}\setminus V_2}\bigl(P_{(2,v)}+Q_{(2,v)}\bigr)\overline{(f\circ g)}_{\bar\gamma}
=\sum_{\gamma\in\Gamma(\bar\gamma)}(f\circ g)_{\gamma},
\tag{A2}
$$

where the function $(f\circ g)_\gamma$ is defined in formula (4.5). We get the right-hand side of
relation (A2) by carrying out the multiplications for the middle term of this expression,
and exploiting the properties of the operators $P_{(2,v)}$ and $Q_{(2,v)}$. Moreover, these
properties also imply that the functions $(f\circ g)_\gamma$ are canonical functions of their variables
$x_{(1,u)}$, $(1,u)\in V_1$, and $x_{(2,v)}$, $(2,v)\in B_{(b,-1)}(\gamma)\cup V_2$. (We preserve the notation
of the main part by which $B_{(b,1)}(\gamma)$ and $B_{(b,-1)}(\gamma)$ denote the sets of those vertices
$(2,j)$ of the second row of the coloured diagram $\gamma$ from which an edge of colour 1 or
colour $-1$ starts.) Indeed, the above properties of the operators $P_{(2,v)}$ and $Q_{(2,v)}$ imply
that $P_{(1,u)}(f\circ g)_\gamma=0$ if $(1,u)\in V_1$, and $P_{(2,v)}(f\circ g)_\gamma=0$ if $(2,v)\in B_{(b,-1)}(\gamma)\cup V_2$.
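The operator identities used in this decomposition can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the paper's construction: it takes a hypothetical finite space $X$ with `m` points, a randomly chosen probability vector `mu`, and an arbitrary array `h` standing in for a function of three variables. The helper names `P`, `Q` and `colour_sum` are ours; colour 1 is mapped to `P` and colour $-1$ to `Q`.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
m = 3                                # size of a hypothetical finite space X
mu = rng.dirichlet(np.ones(m))       # a probability measure on X
h = rng.normal(size=(m, m, m))       # an arbitrary function of three variables


def P(h, axis):
    """Integrate h in the coordinate `axis` against mu.

    The result no longer depends on that coordinate; it is stored as an
    array of the same shape that is constant along `axis`.
    """
    shape = [1] * h.ndim
    shape[axis] = mu.size
    integrated = (h * mu.reshape(shape)).sum(axis=axis, keepdims=True)
    return np.broadcast_to(integrated, h.shape).copy()


def Q(h, axis):
    """Q = I - P in the coordinate `axis`."""
    return h - P(h, axis)


def colour_sum(h, axes):
    """Sum over all colourings of `axes` (1 -> P, -1 -> Q) of the coloured terms."""
    total = np.zeros_like(h)
    for colours in itertools.product((1, -1), repeat=len(axes)):
        term = h
        for axis, colour in zip(axes, colours):
            term = P(term, axis) if colour == 1 else Q(term, axis)
        total += term
    return total
```

With these definitions `P(P(h, 1), 1)` agrees with `P(h, 1)`, `P(Q(h, 1), 1)` vanishes, operators acting on different coordinates commute, and `colour_sum(h, (1, 2))` reproduces `h` up to rounding, which is the finite analogue of expanding $\prod(P+Q)$ into a sum over colourings.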

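Let $Z(\gamma)$ denote the set of edges of colour 1, $W(\gamma)$ the set of edges of colour $-1$ in
the coloured diagram $\gamma$, and let $|Z(\gamma)|$ and $|W(\gamma)|$ be their cardinality. Then $(f\circ g)_\gamma$ is
a (canonical) function with $k(\gamma)=k_1+k_2-(|W(\gamma)|+2|Z(\gamma)|)$ variables, and formula
(A2) implies the following representation of the $U$-statistic $I_{n,\bar k(\bar\gamma)}\bigl(\overline{(f\circ g)}_{\bar\gamma}\bigr)$ in the form
of a sum of degenerate $U$-statistics:

$$
n^{-(k_1+k_2)/2}\,\bar k(\bar\gamma)!\,I_{n,\bar k(\bar\gamma)}\bigl(\overline{(f\circ g)}_{\bar\gamma}\bigr)
=n^{-(k_1+k_2)/2}\sum_{\gamma\in\Gamma(\bar\gamma)}J_n(\gamma)\,n^{|Z(\gamma)|}\,k(\gamma)!\,I_{n,k(\gamma)}\bigl((f\circ g)_\gamma\bigr)
\tag{A3}
$$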

with $J_n(\gamma)=1$ if $|Z(\gamma)|=0$, and

$$
J_n(\gamma)=\frac{\prod\limits_{j=1}^{|Z(\gamma)|}\bigl(n-(k_1+k_2)+|W(\gamma)|+|Z(\gamma)|+j\bigr)}{n^{|Z(\gamma)|}}
\quad\text{if } |Z(\gamma)|>0.
\tag{A4}
$$

The coefficient $J_n(\gamma)n^{|Z(\gamma)|}$ appeared in formula (A3), since if we apply the decomposition
(A2) for all terms $\overline{(f\circ g)}_{\bar\gamma}\bigl(\xi_{j_{(1,u)}},\xi_{j_{(2,v)}},\ (1,u)\in V_1,\ (2,v)\in\{(2,1),\dots,(2,k_2)\}\bigr)$ of
the $U$-statistic $\bar k(\bar\gamma)!\,I_{n,\bar k(\bar\gamma)}\bigl(\overline{(f\circ g)}_{\bar\gamma}\bigr)$, then each term $(f\circ g)_\gamma\bigl(\xi_{j_{(1,u)}},\xi_{j_{(2,v)}},\ (1,u)\in V_1,\ (2,v)\in V_2\cup B_{(b,-1)}(\gamma)\bigr)$
of the $U$-statistic $I_{n,k(\gamma)}\bigl((f\circ g)_\gamma\bigr)$ appears $J_n(\gamma)n^{|Z(\gamma)|}$ times.
(This is so, because $k(\gamma)=k_1+k_2-(|W(\gamma)|+2|Z(\gamma)|)$ variables are fixed in the term
$(f\circ g)_\gamma$ from the $\bar k(\bar\gamma)=k_1+k_2-(|W(\gamma)|+|Z(\gamma)|)$ variables in the term $\overline{(f\circ g)}_{\bar\gamma}$, and
to get formula (A3) from formula (A2) the indices of the remaining $|Z(\gamma)|$ variables can
be freely chosen from the indices $1,\dots,n$, with the only restriction that all indices must
be different.)
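The counting argument in the parenthesis can be verified by brute force for small parameters. The following sketch is only an illustration: the function names `J_n` and `count_free_indices` are ours, `w` and `z` stand for $|W(\gamma)|$ and $|Z(\gamma)|$, and the product in (A4) is assumed to run over $j=1,\dots,|Z(\gamma)|$. The fixed indices are taken to be $1,\dots,k(\gamma)$, which loses no generality because only their number matters.

```python
from fractions import Fraction
from itertools import permutations


def J_n(n, k1, k2, w, z):
    """Coefficient of (A4), with w = |W(gamma)| and z = |Z(gamma)|."""
    if z == 0:
        return Fraction(1)
    product = 1
    for j in range(1, z + 1):
        product *= n - (k1 + k2) + w + z + j
    return Fraction(product, n ** z)


def count_free_indices(n, k1, k2, w, z):
    """Count ordered choices of z distinct indices that avoid the k(gamma) fixed ones."""
    k_gamma = k1 + k2 - (w + 2 * z)
    free = range(k_gamma + 1, n + 1)   # indices 1..k_gamma are treated as already used
    return sum(1 for _ in permutations(free, z))
```

For example, with `n=8, k1=3, k2=3, w=0, z=2` we have $k(\gamma)=2$, and both `J_n(...) * n**z` and `count_free_indices(...)` equal $6\cdot5=30$.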

Formula (4.6) follows from relations (A1) and (A3). (To see that we wrote the right
power of $n$ in this formula observe that $n^{-(k_1+k_2)/2}n^{|Z(\gamma)|}=n^{-k(\gamma)/2}n^{-|W(\gamma)|/2}$.)
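The identity between the powers of $n$ is a one-line consequence of the relation $k(\gamma)=k_1+k_2-(|W(\gamma)|+2|Z(\gamma)|)$ stated earlier; spelled out for the exponents it reads as follows.

```latex
\begin{align*}
-\frac{k_1+k_2}{2}+|Z(\gamma)|
  &= -\frac{k(\gamma)+|W(\gamma)|+2|Z(\gamma)|}{2}+|Z(\gamma)| \\
  &= -\frac{k(\gamma)}{2}-\frac{|W(\gamma)|}{2}.
\end{align*}
```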

To prove inequality (4.7) in the case $|W(\gamma)|=0$ let us estimate first the value of
the function $(f\circ g)_\gamma^2\bigl(x_{(1,u)},x_{(2,v)},\ (1,u)\in V_1,\ (2,v)\in V_2\bigr)$ by means of the Schwarz
inequality. We get that

$$
\begin{aligned}
&(f\circ g)_\gamma^2\bigl(x_{(1,u)},x_{(2,v)},\ (1,u)\in V_1,\ (2,v)\in V_2\bigr)\\
&\qquad\le\int f^2\bigl(x_{(1,u)},x_{(2,v)},\ (1,u)\in V_1,\ (2,v)\in B_{(b,1)}(\gamma)\bigr)\prod_{(2,v)\in B_{(b,1)}(\gamma)}\mu(dx_{(2,v)})\\
&\qquad\qquad\times\int g^2\bigl(x_{(2,v)},\ (2,v)\in V_2\cup B_{(b,1)}(\gamma)\bigr)\prod_{(2,v)\in B_{(b,1)}(\gamma)}\mu(dx_{(2,v)})\\
&\qquad=\prod_{(2,v)\in B_{(b,1)}(\gamma)}P_{(2,v)}f^2\bigl(x_{(1,u)},x_{(2,v)},\ (1,u)\in V_1,\ (2,v)\in B_{(b,1)}(\gamma)\bigr)\\
&\qquad\qquad\times\prod_{(2,v)\in B_{(b,1)}(\gamma)}P_{(2,v)}g^2\bigl(x_{(2,v)},\ (2,v)\in V_2\cup B_{(b,1)}(\gamma)\bigr)
\end{aligned}
\tag{A5}
$$

with the operators $P$ defined in formula (4.2).

Let us observe that the two functions at the right-hand side of (A5) are functions
of different arguments. The first of them depends on the arguments $x_{(1,u)}$, $(1,u)\in V_1$,
the second one on the arguments $x_{(2,v)}$, $(2,v)\in V_2$. Beside this, as the operators $P$
appearing in their definition are contractions in $L_1$-norm, these functions are bounded in
$L_1$-norm by $\|f\|_2^2$ and $\|g\|_2^2$ respectively. Because of the above relations we get formula
(4.7) by integrating inequality (A5).
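The integration step can be written out as follows. This is a sketch in our own shorthand: $F$ and $G$ denote the first and second factor on the right-hand side of (A5), names that do not appear in the original. The statement of (4.7) is not reproduced in this appendix, so the final bound below is only the estimate that the argument above delivers, presumably the one meant in (4.7).

```latex
\begin{align*}
\|(f\circ g)_\gamma\|_2^2
  &= \int (f\circ g)_\gamma^2 \prod_{(1,u)\in V_1}\mu(dx_{(1,u)}) \prod_{(2,v)\in V_2}\mu(dx_{(2,v)}) \\
  &\le \int F \prod_{(1,u)\in V_1}\mu(dx_{(1,u)}) \cdot \int G \prod_{(2,v)\in V_2}\mu(dx_{(2,v)})
   = \|F\|_1\,\|G\|_1 \\
  &\le \|f\|_2^2\,\|g\|_2^2 .
\end{align*}
```

The product form in the second line uses that $F$ and $G$ depend on disjoint sets of variables.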

To prove inequality (4.8) let us introduce, similarly to formula (4.3), the operators

$$
\tilde Q_{u_j}h(x_{u_1},\dots,x_{u_r})=h(x_{u_1},\dots,x_{u_r})+\int h(x_{u_1},\dots,x_{u_r})\,\mu(dx_{u_j}),\qquad 1\le j\le r,
\tag{A6}
$$

in the space of functions $h(x_{u_1},\dots,x_{u_r})$ with coordinates in the space $(X,\mathcal X)$. (The
indices $u_1,\dots,u_r$ are all different.) Observe that both the operators $\tilde Q_{u_j}$ and the operators
$P_{u_j}$ defined in (4.2) are positive, i.e. these operators map a non-negative function
to a non-negative function. Beside this, $Q_{u_j}\le\tilde Q_{u_j}$, and the norms of the operators
$\frac{\tilde Q_{u_j}}{2}$ and $P_{u_j}$ are bounded by 1 both in the $L_2(\mu)$ and the supremum norm.
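The two properties of the operator in (A6) that are used later can be made explicit. In the lines below we write $\tilde Q_{u_j}=I+P_{u_j}$ for this operator and $Q_{u_j}=I-P_{u_j}$ for the operator of (4.3); the pointwise reading of the comparison between $Q_{u_j}$ and $\tilde Q_{u_j}$ is our interpretation of the inequality stated in the text.

```latex
\begin{align*}
|Q_{u_j}h| &= |h-P_{u_j}h| \le |h|+P_{u_j}|h| = \tilde Q_{u_j}|h|, \\
\|\tilde Q_{u_j}h\| &\le \|h\|+\|P_{u_j}h\| \le 2\|h\|,
\end{align*}
```

where $\|\cdot\|$ is either the $L_2(\mu)$ norm or the supremum norm, and the last step uses that $P_{u_j}$ has norm at most 1 in both.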

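Let us define the function

$$
\begin{aligned}
&\widetilde{(f\circ g)}_\gamma\bigl(x_{(1,j)},x_{(2,j')},\ j\in\{1,\dots,k_1\}\setminus B_u(\gamma),\ j'\in\{1,\dots,k_2\}\setminus B_{(b,1)}\bigr)\\
&\qquad=\prod_{(2,j')\in B_{(b,1)}(\gamma)}P_{(2,j')}\prod_{(2,j')\in B_{(b,-1)}(\gamma)}\tilde Q_{(2,j')}\,\overline{(f\circ g)}_\gamma
\end{aligned}
\tag{A7}
$$

with the notation of Section 4 in the main part. We have defined the function $\widetilde{(f\circ g)}_\gamma$
with the help of $\overline{(f\circ g)}_\gamma$ similarly to the definition of $(f\circ g)_\gamma$ in (4.5), only we have
replaced the operators $Q_{(2,j')}$ by $\tilde Q_{(2,j')}$ in it.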

We may assume that $\|g\|_2\le\|f\|_2$. We can write because of the properties of the
operators $P_{u_j}$ and $\tilde Q_{u_j}$ listed above and the condition $\sup|f(x_1,\dots,x_k)|\le1$ that

$$
|(f\circ g)_\gamma|\le\widetilde{(|f|\circ|g|)}_\gamma\le\widetilde{(1\circ|g|)}_\gamma,
\tag{A8}
$$

where '$\le$' means that the function at the right-hand side is greater than or equal to
the function at the left-hand side in all points, and 1 denotes the function which equals
identically 1. Because of relation (A8) it is enough to show that



$$
\bigl\|\widetilde{(1\circ|g|)}_\gamma\bigr\|_2
=\Bigl\|\prod_{(2,j)\in B_{(b,1)}(\gamma)}P_{(2,j)}\prod_{(2,j)\in B_{(b,-1)}(\gamma)}\tilde Q_{(2,j)}\,|g(x_{(2,1)},\dots,x_{(2,k_2)})|\Bigr\|_2
\le2^{|W(\gamma)|}\|g\|_2
\tag{A9}
$$

to prove relation (4.8). But this inequality trivially holds, since the norm of all operators
$P_{(2,j)}$ in formula (A9) is bounded by 1, the norm of all operators $\tilde Q_{(2,j)}$ is bounded by 2
in the $L_2(\mu)$ norm, and $|B_{(b,-1)}(\gamma)|=|W(\gamma)|$.
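A bound of the type (A9) can be illustrated numerically on a finite space. The sketch below is an assumption-laden toy example rather than the paper's setting: $X$ has `m` points, `mu` is a random probability vector, `g` is an arbitrary function of $k_2=3$ variables, coordinate 0 plays the role of a vertex in $B_{(b,1)}(\gamma)$, coordinate 1 of a vertex in $B_{(b,-1)}(\gamma)$, so that $|W(\gamma)|=1$, and coordinate 2 is left untouched. The names `P`, `Q_tilde` and `l2_norm` are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4                                # size of a hypothetical finite space X
mu = rng.dirichlet(np.ones(m))       # a probability measure on X
g = rng.normal(size=(m, m, m))       # an arbitrary function of k_2 = 3 variables


def P(h, axis):
    """Integrate h in the coordinate `axis` against mu, keeping the array shape."""
    shape = [1] * h.ndim
    shape[axis] = mu.size
    integrated = (h * mu.reshape(shape)).sum(axis=axis, keepdims=True)
    return np.broadcast_to(integrated, h.shape).copy()


def Q_tilde(h, axis):
    """The operator I + P in the coordinate `axis`, as in (A6)."""
    return h + P(h, axis)


def l2_norm(h):
    """L_2 norm with respect to the product measure mu x ... x mu."""
    weights = np.ones_like(h)
    for axis in range(h.ndim):
        shape = [1] * h.ndim
        shape[axis] = mu.size
        weights = weights * mu.reshape(shape)
    return np.sqrt((h ** 2 * weights).sum())


# One P (colour 1) and one Q_tilde (colour -1), hence the factor 2 ** 1.
lhs = l2_norm(Q_tilde(P(np.abs(g), 0), 1))
rhs = 2 ** 1 * l2_norm(g)
```

Here `lhs <= rhs` holds because `P` is a contraction and `Q_tilde` has norm at most 2 in the weighted $L_2$ norm, which is the reasoning given for (A9).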